Index (Spark 2.4.5 JavaDoc)


A

abort(WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.DataSourceWriter: Aborts this writing job because some data writers have failed and keep failing when retried, or the Spark job fails for some unknown reason, or DataSourceWriter.onDataWriterCommit(WriterCommitMessage) fails, or DataSourceWriter.commit(WriterCommitMessage[]) fails.
abort() - Method in interface org.apache.spark.sql.sources.v2.writer.DataWriter: Aborts this writer if it has failed.
abort(long, WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.streaming.StreamWriter: Aborts this writing job because some data writers have failed and keep failing when retried, or the Spark job fails for some unknown reason, or StreamWriter.commit(WriterCommitMessage[]) fails.
abort(WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.streaming.StreamWriter
abortJob(JobContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol: Aborts a job after the writes fail.
abortJob(JobContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
abortTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol: Aborts a task after the writes have failed.
abortTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
abs(Column) - Static method in class org.apache.spark.sql.functions: Computes the absolute value of a numeric value.
abs() - Method in class org.apache.spark.sql.types.Decimal
absent() - Static method in class org.apache.spark.api.java.Optional
AbsoluteError - Class in org.apache.spark.mllib.tree.loss: :: DeveloperApi :: Class for absolute error loss calculation (for regression).
AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
AbstractLauncher<T extends AbstractLauncher<T>> - Class in org.apache.spark.launcher: Base class for launcher implementations.
accept(Parsers) - Static method in class org.apache.spark.ml.feature.RFormulaParser
accept(ES, Function1<ES, List<Object>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
accept(String, PartialFunction<Object, U>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
accept(Path) - Method in class org.apache.spark.ml.image.SamplePathFilter
acceptIf(Function1<Object, Object>, Function1<Object, String>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
acceptMatch(String, PartialFunction<Object, U>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
acceptSeq(ES, Function1<ES, Iterable<Object>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
acceptsType(DataType) - Method in class org.apache.spark.sql.types.ObjectType
accId() - Method in class org.apache.spark.CleanAccum
accumCleaned(long) - Method in interface org.apache.spark.CleanerListener
Accumulable<R,T> - Class in org.apache.spark: Deprecated. Use AccumulatorV2. Since 2.0.0.
Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable: Deprecated.
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use AccumulatorV2. Since 2.0.0.
accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use AccumulatorV2. Since 2.0.0.
accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext: Deprecated. Use AccumulatorV2. Since 2.0.0.
accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext: Deprecated. Use AccumulatorV2. Since 2.0.0.
accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext: Deprecated. Use AccumulatorV2. Since 2.0.0.
AccumulableInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Information about an Accumulable modified during a task or stage.
AccumulableInfo - Class in org.apache.spark.status.api.v1
accumulableInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
accumulableInfoToJson(AccumulableInfo) - Static method in class org.apache.spark.util.JsonProtocol
AccumulableParam<R,T> - Interface in org.apache.spark: Deprecated. Use AccumulatorV2. Since 2.0.0.
accumulables() - Method in class org.apache.spark.scheduler.StageInfo: Terminal values of accumulables updated during this stage, including all the user-defined accumulators.
accumulables() - Method in class org.apache.spark.scheduler.TaskInfo: Intermediate updates to accumulables during this task.
accumulablesToJson(Traversable<AccumulableInfo>) - Static method in class org.apache.spark.util.JsonProtocol
Accumulator<T> - Class in org.apache.spark: Deprecated. Use AccumulatorV2. Since 2.0.0.
accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use sc().longAccumulator(). Since 2.0.0.
accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use sc().longAccumulator(String). Since 2.0.0.
accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use sc().doubleAccumulator(). Since 2.0.0.
accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use sc().doubleAccumulator(String). Since 2.0.0.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use AccumulatorV2. Since 2.0.0.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use AccumulatorV2. Since 2.0.0.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext: Deprecated. Use AccumulatorV2. Since 2.0.0.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext: Deprecated. Use AccumulatorV2. Since 2.0.0.
AccumulatorContext - Class in org.apache.spark.util: An internal class used to track accumulators by Spark itself.
AccumulatorContext() - Constructor for class org.apache.spark.util.AccumulatorContext
AccumulatorParam<T> - Interface in org.apache.spark: Deprecated. Use AccumulatorV2. Since 2.0.0.
AccumulatorParam.DoubleAccumulatorParam$ - Class in org.apache.spark: Deprecated. Use AccumulatorV2. Since 2.0.0.
AccumulatorParam.FloatAccumulatorParam$ - Class in org.apache.spark: Deprecated. Use AccumulatorV2. Since 2.0.0.
AccumulatorParam.IntAccumulatorParam$ - Class in org.apache.spark: Deprecated. Use AccumulatorV2. Since 2.0.0.
AccumulatorParam.LongAccumulatorParam$ - Class in org.apache.spark: Deprecated. Use AccumulatorV2. Since 2.0.0.
AccumulatorParam.StringAccumulatorParam$ - Class in org.apache.spark: Deprecated. Use AccumulatorV2. Since 2.0.0.
ACCUMULATORS() - Static method in class org.apache.spark.status.TaskIndexNames
accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.StageData
accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.TaskData
AccumulatorV2<IN,OUT> - Class in org.apache.spark.util: The base class for accumulators, that can accumulate inputs of type IN, and produce output of type OUT.
AccumulatorV2() - Constructor for class org.apache.spark.util.AccumulatorV2
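To make the IN/OUT distinction in the AccumulatorV2 entry concrete, here is a minimal plain-Java sketch with no Spark dependency. The class AverageAccumulatorSketch is hypothetical and does not extend Spark's AccumulatorV2; only the method names add, merge, and value are borrowed from that class. It takes long inputs (the IN side) and produces a double average (the OUT side).

```java
// Hypothetical plain-Java class for illustration; it does not extend Spark's AccumulatorV2.
final class AverageAccumulatorSketch {
    private long sum = 0L;
    private long count = 0L;

    // IN is a long: fold one input into the running state.
    void add(long v) {
        sum += v;
        count += 1;
    }

    // Combine state accumulated elsewhere (for example by another task) into this one.
    void merge(AverageAccumulatorSketch other) {
        sum += other.sum;
        count += other.count;
    }

    // OUT is a double: the average of all inputs seen so far, or 0.0 if there are none.
    double value() {
        return count == 0 ? 0.0 : (double) sum / count;
    }

    public static void main(String[] args) {
        AverageAccumulatorSketch a = new AverageAccumulatorSketch();
        AverageAccumulatorSketch b = new AverageAccumulatorSketch();
        a.add(1);
        a.add(2);
        b.add(6);
        a.merge(b);
        System.out.println(a.value()); // average of 1, 2 and 6
    }
}
```

Keeping the sum and count as internal state, rather than a running average, is what lets merge combine two partial results without losing information.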
accumUpdates() - Method in class org.apache.spark.ExceptionFailure
accumUpdates() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
accumUpdates() - Method in class org.apache.spark.TaskKilled
accuracy() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns accuracy.
accuracy() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns accuracy (equal to the number of correctly classified instances divided by the total number of instances).
accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns accuracy.
acos(Column) - Static method in class org.apache.spark.sql.functions
acos(String) - Static method in class org.apache.spark.sql.functions
ActivationFunction - Interface in org.apache.spark.ml.ann: Trait for functions and their derivatives for functional layers
active() - Static method in class org.apache.spark.sql.SparkSession: Returns the currently active SparkSession, otherwise the default one.
active() - Method in class org.apache.spark.sql.streaming.StreamingQueryManager: Returns a list of active queries associated with this SQLContext
active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
ACTIVE() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
activeStages() - Method in class org.apache.spark.status.LiveJob
activeTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
activeTasks() - Method in class org.apache.spark.status.LiveExecutor
activeTasks() - Method in class org.apache.spark.status.LiveJob
activeTasks() - Method in class org.apache.spark.status.LiveStage
activeTasksPerExecutor() - Method in class org.apache.spark.status.LiveStage
add(T) - Method in class org.apache.spark.Accumulable: Deprecated. Add more data to this accumulator / accumulable.
add(Vector) - Method in class org.apache.spark.ml.clustering.ExpectationAggregator: Add a new training instance to this ExpectationAggregator, update the weights, means and covariances for each distribution, and update the log likelihood.
add(Datum) - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator: Add a single data point to this aggregator.
add(AFTPoint) - Method in class org.apache.spark.ml.regression.AFTAggregator: Add a new training instance to this AFTAggregator, and update the loss and gradient of the objective function.
add(double[], MultivariateGaussian[], ExpectationSum, Vector<Object>) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator: Adds a new document.
add(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Adds the given block matrix other to this block matrix: this + other.
add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Add a new sample to this summarizer, and update the statistical summary.
add(StructField) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field.
add(String, DataType) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new nullable field with no metadata.
add(String, DataType, boolean) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field with no metadata.
add(String, DataType, boolean, Metadata) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field and specifying metadata.
add(String, DataType, boolean, String) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field and specifying a comment.
add(String, String) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new nullable field with no metadata where the dataType is specified as a String.
add(String, String, boolean) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field with no metadata where the dataType is specified as a String.
add(String, String, boolean, Metadata) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field and specifying metadata where the dataType is specified as a String.
add(String, String, boolean, String) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field and specifying a comment where the dataType is specified as a String.
add(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
add(IN) - Method in class org.apache.spark.util.AccumulatorV2: Takes the inputs and accumulates.
add(T) - Method in class org.apache.spark.util.CollectionAccumulator
add(Double) - Method in class org.apache.spark.util.DoubleAccumulator: Adds v to the accumulator.
add(double) - Method in class org.apache.spark.util.DoubleAccumulator: Adds v to the accumulator.
add(T) - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
add(Long) - Method in class org.apache.spark.util.LongAccumulator: Adds v to the accumulator.
add(long) - Method in class org.apache.spark.util.LongAccumulator: Adds v to the accumulator.
add(Object) - Method in class org.apache.spark.util.sketch.CountMinSketch: Increments item's count by one.
add(Object, long) - Method in class org.apache.spark.util.sketch.CountMinSketch: Increments item's count by count.
add_months(Column, int) - Static method in class org.apache.spark.sql.functions: Returns the date that is numMonths after startDate.
addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam: Deprecated. Add additional data to the accumulator value.
addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam: Deprecated.
addAppArgs(String...) - Method in class org.apache.spark.launcher.AbstractLauncher: Adds command line arguments for the application.
addAppArgs(String...) - Method in class org.apache.spark.launcher.SparkLauncher
addBinary(byte[]) - Method in class org.apache.spark.util.sketch.CountMinSketch: Increments item's count by one.
addBinary(byte[], long) - Method in class org.apache.spark.util.sketch.CountMinSketch: Increments item's count by count.
addDirectory(String, File) - Method in interface org.apache.spark.rpc.RpcEnvFileServer: Adds a local directory to be served via this file server.
addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Add a file to be downloaded with this Spark job on every node.
addFile(String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext: Add a file to be downloaded with this Spark job on every node.
addFile(String) - Method in class org.apache.spark.launcher.AbstractLauncher: Adds a file to be submitted with the application.
addFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
addFile(File) - Method in interface org.apache.spark.rpc.RpcEnvFileServer: Adds a file to be served by this RpcEnv.
addFile(String) - Method in class org.apache.spark.SparkContext: Add a file to be downloaded with this Spark job on every node.
addFile(String, boolean) - Method in class org.apache.spark.SparkContext: Add a file to be downloaded with this Spark job on every node.
addFilters(Seq<ServletContextHandler>, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils: Add filters, if any, to the given list of ServletContextHandlers
addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds a param with multiple values (overwrites if the input param exists).
addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds a double param with multiple values.
addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds an int param with multiple values.
addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds a float param with multiple values.
addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds a long param with multiple values.
addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds a boolean param with true and false.
addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam: Deprecated. Merge two accumulated values together.
addInPlace(double, double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$: Deprecated.
addInPlace(float, float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$: Deprecated.
addInPlace(int, int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$: Deprecated.
addInPlace(long, long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$: Deprecated.
addInPlace(String, String) - Method in class org.apache.spark.AccumulatorParam.StringAccumulatorParam$: Deprecated.
addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(String) - Method in class org.apache.spark.launcher.AbstractLauncher: Adds a jar file to be submitted with the application.
addJar(String) - Method in class org.apache.spark.launcher.SparkLauncher
addJar(File) - Method in interface org.apache.spark.rpc.RpcEnvFileServer: Adds a jar to be served by this RpcEnv.
addJar(String) - Method in class org.apache.spark.SparkContext: Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Adds a jar to the class loader.
addJar(String) - Method in class org.apache.spark.sql.hive.HiveSessionResourceLoader
addListener(SparkAppHandle.Listener) - Method in interface org.apache.spark.launcher.SparkAppHandle: Adds a listener to be notified of changes to the handle's information.
addListener(StreamingQueryListener) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager: Register a StreamingQueryListener to receive up-calls for life cycle events of StreamingQuery.
addListener(L) - Method in interface org.apache.spark.util.ListenerBus: Adds a listener to listen for events.
addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD: Add Hadoop configuration specific to a single partition and attempt.
addLong(long) - Method in class org.apache.spark.util.sketch.CountMinSketch: Increments item's count by one.
addLong(long, long) - Method in class org.apache.spark.util.sketch.CountMinSketch: Increments item's count by count.
addMapOutput(int, MapStatus) - Method in class org.apache.spark.ShuffleStatus: Register a map output.
addMetrics(TaskMetrics, TaskMetrics) - Static method in class org.apache.spark.status.LiveEntityHelpers: Add m2 values to m1.
addPartition(LiveRDDPartition) - Method in class org.apache.spark.status.RDDPartitionSeq
addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
addPyFile(String) - Method in class org.apache.spark.launcher.AbstractLauncher: Adds a python file / zip / egg to be submitted with the application.
addPyFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
address() - Method in class org.apache.spark.BarrierTaskInfo
address() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
addSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
addShutdownHook(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.ShutdownHookManager: Adds a shutdown hook with default priority.
addShutdownHook(int, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.ShutdownHookManager: Adds a shutdown hook with the given priority.
addSparkArg(String) - Method in class org.apache.spark.launcher.AbstractLauncher: Adds a no-value argument to the Spark invocation.
addSparkArg(String, String) - Method in class org.apache.spark.launcher.AbstractLauncher: Adds an argument with a value to the Spark invocation.
addSparkArg(String) - Method in class org.apache.spark.launcher.SparkLauncher
addSparkArg(String, String) - Method in class org.apache.spark.launcher.SparkLauncher
addSparkListener(SparkListenerInterface) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Register a listener to receive up-calls from events that happen during execution.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Add a StreamingListener object for receiving system events related to streaming.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext: Add a StreamingListener object for receiving system events related to streaming.
addString(String) - Method in class org.apache.spark.util.sketch.CountMinSketch: Increments item's count by one.
addString(String, long) - Method in class org.apache.spark.util.sketch.CountMinSketch: Increments item's count by count.
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.BarrierTaskContext
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext: Adds a (Java friendly) listener to be executed on task completion.
addTaskCompletionListener(Function1<TaskContext, U>) - Method in class org.apache.spark.TaskContext: Adds a listener in the form of a Scala closure to be executed on task completion.
addTaskFailureListener(TaskFailureListener) - Method in class org.apache.spark.BarrierTaskContext
addTaskFailureListener(TaskFailureListener) - Method in class org.apache.spark.TaskContext: Adds a listener to be executed on task failure.
addTaskFailureListener(Function2<TaskContext, Throwable, BoxedUnit>) - Method in class org.apache.spark.TaskContext: Adds a listener to be executed on task failure.
addTaskSetManager(Schedulable, Properties) - Method in interface org.apache.spark.scheduler.SchedulableBuilder
addTime() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
addTime() - Method in class org.apache.spark.status.LiveExecutor
addURL(URL) - Method in class org.apache.spark.util.MutableURLClassLoader
AddWebUIFilter(String, Map<String, String>, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
AddWebUIFilter$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
AFTAggregator - Class in org.apache.spark.ml.regression: AFTAggregator computes the gradient and loss for an AFT loss function, as used in AFT survival regression, for samples given as sparse or dense vectors, in an online fashion.
AFTAggregator(Broadcast<DenseVector<Object>>, boolean, Broadcast<double[]>) - Constructor for class org.apache.spark.ml.regression.AFTAggregator
AFTCostFun - Class in org.apache.spark.ml.regression: AFTCostFun implements Breeze's DiffFunction[T] for AFT cost.
AFTCostFun(RDD<AFTPoint>, boolean, Broadcast<double[]>, int) - Constructor for class org.apache.spark.ml.regression.AFTCostFun
AFTSurvivalRegression - Class in org.apache.spark.ml.regression: :: Experimental :: Fit a parametric survival regression model named accelerated failure time (AFT) model (see Accelerated failure time model (Wikipedia)) based on the Weibull distribution of the survival time.
AFTSurvivalRegression(String) - Constructor for class org.apache.spark.ml.regression.AFTSurvivalRegression
AFTSurvivalRegression() - Constructor for class org.apache.spark.ml.regression.AFTSurvivalRegression
AFTSurvivalRegressionModel - Class in org.apache.spark.ml.regression: :: Experimental :: Model produced by AFTSurvivalRegression.
AFTSurvivalRegressionParams - Interface in org.apache.spark.ml.regression: Params for accelerated failure time (AFT) regression.
agg(Column, Column...) - Method in class org.apache.spark.sql.Dataset: Aggregates on the entire Dataset without groups.
agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Aggregates on the entire Dataset without groups.
agg(Map<String, String>) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Aggregates on the entire Dataset without groups.
agg(Map<String, String>) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Aggregates on the entire Dataset without groups.
agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Aggregates on the entire Dataset without groups.
agg(TypedColumn<V, U1>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: Computes the given aggregation, returning a Dataset of tuples for each unique key and the result of computing this aggregation over all elements in the group.
agg(TypedColumn<V, U1>, TypedColumn<V, U2>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: Computes the given aggregations, returning a Dataset of tuples for each unique key and the result of computing these aggregations over all elements in the group.
agg(TypedColumn<V, U1>, TypedColumn<V, U2>, TypedColumn<V, U3>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: Computes the given aggregations, returning a Dataset of tuples for each unique key and the result of computing these aggregations over all elements in the group.
agg(TypedColumn<V, U1>, TypedColumn<V, U2>, TypedColumn<V, U3>, TypedColumn<V, U4>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: Computes the given aggregations, returning a Dataset of tuples for each unique key and the result of computing these aggregations over all elements in the group.
agg(Column, Column...) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Compute aggregates by specifying a series of aggregate columns.
agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: (Scala-specific) Compute aggregates by specifying the column names and aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: (Scala-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: (Java-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Compute aggregates by specifying a series of aggregate columns.
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
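As a rough illustration of the aggregate contract described in the two entries above, the following plain-Java sketch (no Spark dependency) folds each partition with a sequence function starting from the zero value, then merges the per-partition results with a combine function. The class name AggregateSketch and the use of nested lists as a stand-in for partitions are assumptions made for this example, not part of the Spark API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical stand-in for the aggregate contract; partitions are modeled as nested lists.
final class AggregateSketch {
    static <T, U> U aggregate(List<List<T>> partitions, U zero,
                              BiFunction<U, T, U> seqOp, BinaryOperator<U> combOp) {
        U result = zero;
        for (List<T> partition : partitions) {
            // Fold one partition, starting from the zero value.
            U partial = zero;
            for (T element : partition) {
                partial = seqOp.apply(partial, element);
            }
            // Merge this partition's result into the overall result.
            result = combOp.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
            Arrays.asList(1, 2, 3),
            Arrays.asList(4, 5));
        Integer total = aggregate(partitions, 0, (acc, x) -> acc + x, Integer::sum);
        System.out.println(total); // sum of all elements across both partitions
    }
}
```

In this sketch the zero value is used once per partition and once more as the starting point of the final merge, which is why it has to be neutral for both functions.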
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.PairRDDFunctions: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.PairRDDFunctions: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.PairRDDFunctions: Aggregate the values of each key, using given combine functions and a neutral "zero value".
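The per-key variant can be sketched the same way in plain Java, without a Spark dependency. Within each partition the values of a key are folded from the zero value, and the per-key partial results are then combined across partitions. AggregateByKeySketch, its entry helper, and the nested-list model of partitions are hypothetical names introduced for this illustration only.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical stand-in for the aggregateByKey contract; not Spark code.
final class AggregateByKeySketch {
    // Small helper so call sites get a Map.Entry<K, V> without extra type hints.
    static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new SimpleEntry<>(key, value);
    }

    static <K, V, U> Map<K, U> aggregateByKey(List<List<Map.Entry<K, V>>> partitions, U zero,
                                              BiFunction<U, V, U> seqFunc,
                                              BinaryOperator<U> combFunc) {
        Map<K, U> merged = new LinkedHashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            // Within a partition, fold each key's values starting from the zero value.
            Map<K, U> partial = new LinkedHashMap<>();
            for (Map.Entry<K, V> kv : partition) {
                U soFar = partial.getOrDefault(kv.getKey(), zero);
                partial.put(kv.getKey(), seqFunc.apply(soFar, kv.getValue()));
            }
            // Across partitions, combine the per-key partial results.
            for (Map.Entry<K, U> e : partial.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), combFunc);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Integer>>> partitions = Arrays.asList(
            Arrays.asList(entry("a", 1), entry("b", 2)),
            Arrays.asList(entry("a", 3)));
        Map<String, Integer> sums =
            aggregateByKey(partitions, 0, (acc, v) -> acc + v, Integer::sum);
        System.out.println(sums); // per-key sums across both partitions
    }
}
```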
AggregatedDialect - Class in org.apache.spark.sql.jdbc: AggregatedDialect can unify multiple dialects into one virtual Dialect.
AggregatedDialect(List<JdbcDialect>) - Constructor for class org.apache.spark.sql.jdbc.AggregatedDialect
aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph: Aggregates values from the neighboring edges and vertices of each vertex.
aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD: Aggregates vertices in messages that have the same ids using reduceFunc, returning a VertexRDD co-indexed with this.
AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
aggregationDepth() - Method in interface org.apache.spark.ml.param.shared.HasAggregationDepth: Param for suggested depth for treeAggregate (>= 2).
Aggregator<K,V,C> - Class in org.apache.spark: :: DeveloperApi :: A set of functions used to aggregate data.
Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
aggregator() - Method in class org.apache.spark.ShuffleDependency
Aggregator<IN,BUF,OUT> - Class in org.apache.spark.sql.expressions: :: Experimental :: A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value.
Aggregator() - Constructor for class org.apache.spark.sql.expressions.Aggregator
aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
aic() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: Akaike Information Criterion (AIC) for the fitted model.
Algo - Class in org.apache.spark.mllib.tree.configuration: Enum to select the algorithm for the decision tree
Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
alias(String) - Method in class org.apache.spark.sql.Column: Gives the column an alias.
alias(String) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with an alias set.
alias(Symbol) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Returns a new Dataset with an alias set.
All - Static variable in class org.apache.spark.graphx.TripletFields: Expose all the fields (source, edge, and destination).
AllJobsCancelled - Class in org.apache.spark.scheduler
AllJobsCancelled() - Constructor for class org.apache.spark.scheduler.AllJobsCancelled
allocator() - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
AllReceiverIds - Class in org.apache.spark.streaming.scheduler: A message used by ReceiverTracker to ask for the IDs of all receivers still stored in ReceiverTrackerEndpoint.
AllReceiverIds() - Constructor for class org.apache.spark.streaming.scheduler.AllReceiverIds
allSources() - Static method in class org.apache.spark.metrics.source.StaticSources: The set of all static sources.
alpha() - Method in interface org.apache.spark.ml.recommendation.ALSParams: Param for the alpha parameter in the implicit preference formulation (nonnegative).
alpha() - Method in class org.apache.spark.mllib.random.WeibullGenerator
ALS - Class in org.apache.spark.ml.recommendation: Alternating Least Squares (ALS) matrix factorization.
ALS(String) - Constructor for class org.apache.spark.ml.recommendation.ALS
ALS() - Constructor for class org.apache.spark.ml.recommendation.ALS
ALS - Class in org.apache.spark.mllib.recommendation: Alternating Least Squares matrix factorization.
ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS: Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10, lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
ALS.InBlock$ - Class in org.apache.spark.ml.recommendation
ALS.LeastSquaresNESolver - Interface in org.apache.spark.ml.recommendation: Trait for least squares solvers applied to the normal equation.
ALS.Rating<ID> - Class in org.apache.spark.ml.recommendation: :: DeveloperApi :: Rating class for better code readability.
ALS.Rating$ - Class in org.apache.spark.ml.recommendation
ALS.RatingBlock$ - Class in org.apache.spark.ml.recommendation
ALSModel - Class in org.apache.spark.ml.recommendation: Model fitted by ALS.
ALSModelParams - Interface in org.apache.spark.ml.recommendation: Common params for ALS and ALSModel.
ALSParams - Interface in org.apache.spark.ml.recommendation: Common params for ALS.
alterDatabase(CatalogDatabase) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Alter a database whose name matches the one specified in database, assuming it exists.
alterFunction(String, CatalogFunction) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Alter a function whose name matches the one specified in `func`, assuming it exists.
alterPartitions(String, String, Seq<CatalogTablePartition>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Alter one or more table partitions whose specs match the ones specified in newParts, assuming the partitions exist.
alterTable(CatalogTable) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Alter a table whose name matches the one specified in `table`, assuming it exists.
alterTable(String, String, CatalogTable) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Updates the given table with new metadata, optionally renaming the table or moving it to a different database.
alterTableDataSchema(String, String, StructType, Map<String, String>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Updates the given table with a new data schema and table properties, and keeps everything else unchanged.
am() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager
AnalysisException - Exception in org.apache.spark.sql: Thrown when a query fails to analyze, usually because the query itself is invalid.
and(Column) - Method in class org.apache.spark.sql.Column: Boolean AND.
And - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff both left and right evaluate to true.
And(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.And
antecedent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
AnyDataType - Class in org.apache.spark.sql.types: An AbstractDataType that matches any concrete data type.
AnyDataType() - Constructor for class org.apache.spark.sql.types.AnyDataType
anyNull() - Method in interface org.apache.spark.sql.Row: Returns true if there are any NULL values in this row.
anyNull() - Method in class org.apache.spark.sql.vectorized.ColumnarRow
ApiHelper - Class in org.apache.spark.ui.jobs
ApiHelper() - Constructor for class org.apache.spark.ui.jobs.ApiHelper
ApiRequestContext - Interface in org.apache.spark.status.api.v1
appAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
Append() - Static method in class org.apache.spark.sql.streaming.OutputMode: OutputMode in which only the new rows in the streaming DataFrame/Dataset will be written to the sink.
appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils: Returns a new vector with 1.0 (bias) appended to the input vector.
appendColumn(StructType, String, DataType, boolean) - Static method in class org.apache.spark.ml.util.SchemaUtils: Appends a new column to the input schema.
appendColumn(StructType, StructField) - Static method in class org.apache.spark.ml.util.SchemaUtils: Appends a new column to the input schema.
appendReadColumns(Configuration, Seq<Integer>, Seq<String>) - Static method in class org.apache.spark.sql.hive.HiveShim
AppHistoryServerPlugin - Interface in org.apache.spark.status: An interface for creating history listeners (to replay event logs) defined in other modules like SQL, and for setting up the UI of the plugin to rebuild the history UI.
appId() - Method in interface org.apache.spark.scheduler.SchedulerBackend
appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
appId() - Method in interface org.apache.spark.scheduler.TaskScheduler
appId() - Method in interface org.apache.spark.status.api.v1.BaseAppResource
APPLICATION_EXECUTOR_LIMIT() - Static method in class org.apache.spark.ui.ToolTips
applicationAttemptId() - Method in interface org.apache.spark.scheduler.SchedulerBackend: Get the attempt ID for this run, if the cluster manager supports multiple attempts.
applicationAttemptId() - Method in interface org.apache.spark.scheduler.TaskScheduler: Get an application's attempt ID associated with the job.
applicationAttemptId() - Method in class org.apache.spark.SparkContext
ApplicationAttemptInfo - Class in org.apache.spark.status.api.v1
applicationEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
applicationEndToJson(SparkListenerApplicationEnd) - Static method in class org.apache.spark.util.JsonProtocol
ApplicationEnvironmentInfo - Class in org.apache.spark.status.api.v1
applicationId() - Method in interface org.apache.spark.scheduler.SchedulerBackend: Get an application ID associated with the job.
applicationId() - Method in interface org.apache.spark.scheduler.TaskScheduler: Get an application ID associated with the job.
applicationId() - Method in class org.apache.spark.SparkContext: A unique identifier for the Spark application.
ApplicationInfo - Class in org.apache.spark.status.api.v1
applicationStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
applicationStartToJson(SparkListenerApplicationStart) - Static method in class org.apache.spark.util.JsonProtocol
ApplicationStatus - Enum in org.apache.spark.status.api.v1
apply(T1) - Static method in class org.apache.spark.CleanAccum
apply(T1) - Static method in class org.apache.spark.CleanBroadcast
apply(T1) - Static method in class org.apache.spark.CleanCheckpoint
apply(T1) - Static method in class org.apache.spark.CleanRDD
apply(T1) - Static method in class org.apache.spark.CleanShuffle
apply(T1, T2) - Static method in class org.apache.spark.ContextBarrierId
apply(T1, T2, T3, T4, T5, T6, T7) - Static method in class org.apache.spark.ExceptionFailure
apply(T1, T2, T3) - Static method in class org.apache.spark.ExecutorLostFailure
apply(T1) - Static method in class org.apache.spark.ExecutorRegistered
apply(T1) - Static method in class org.apache.spark.ExecutorRemoved
apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.FetchFailed
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph: Construct a graph from a collection of vertices and edges with attributes.
apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl: Create a graph from edges, setting referenced vertices to defaultVertexAttr.
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl: Create a graph from vertices and edges, setting missing vertices to defaultVertexAttr.
apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl: Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel: Execute a Pregel-like iterative vertex-parallel abstraction.
apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD: Constructs a standalone VertexRDD (one that is not set up for efficient joins with an EdgeRDD) from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD: Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD: Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(DenseMatrix<Object>, DenseMatrix<Object>, Function1<Object, Object>) - Static method in class org.apache.spark.ml.ann.ApplyInPlace
apply(DenseMatrix<Object>, DenseMatrix<Object>, DenseMatrix<Object>, Function2<Object, Object, Object>) - Static method in class org.apache.spark.ml.ann.ApplyInPlace
apply(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Gets an attribute by its name.
apply(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Gets an attribute by its index.
apply(T1, T2) - Static method in class org.apache.spark.ml.clustering.ClusterData
apply(T1, T2) - Static method in class org.apache.spark.ml.feature.LabeledPoint
apply(int, int) - Method in class org.apache.spark.ml.linalg.DenseMatrix
apply(int) - Method in class org.apache.spark.ml.linalg.DenseVector
apply(int, int) - Method in interface org.apache.spark.ml.linalg.Matrix: Gets the (i, j)-th element.
apply(int, int) - Method in class org.apache.spark.ml.linalg.SparseMatrix
apply(int) - Method in class org.apache.spark.ml.linalg.SparseVector
apply(int) - Method in interface org.apache.spark.ml.linalg.Vector: Gets the value of the ith element.
apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap: Gets the value of the input param or its default value if it does not exist.
apply(GeneralizedLinearRegressionBase) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.FamilyAndLink$: Constructs the FamilyAndLink object from a parameter map
apply(Split) - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData$
apply(T1, T2, T3) - Static method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
apply(T1, T2, T3) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
apply(T1, T2, T3, T4) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
apply(BinaryConfusionMatrix) - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryClassificationMetricComputer
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Precision
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Recall
apply(T1) - Static method in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data
apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.mllib.feature.VocabWord
apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
apply(T1, T2) - Static method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
apply(T1, T2, T3) - Static method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix: Gets the (i, j)-th element.
apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
apply(int) - Method in class org.apache.spark.mllib.linalg.SparseVector
apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector: Gets the value of the ith element.
apply(T1, T2, T3) - Static method in class org.apache.spark.mllib.recommendation.Rating
apply(T1, T2) - Static method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
apply(T1, T2) - Static method in class org.apache.spark.mllib.stat.test.BinarySample
apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.Algo
apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
apply(int, Node) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData$
apply(Row) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData$
apply(int, Node) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
apply(Row) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
apply(Predict) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData$
apply(Row) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData$
apply(Predict) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
apply(Row) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
apply(Split) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData$
apply(Row) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData$
apply(Split) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
apply(Row) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node: Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
apply(T1, T2, T3, T4) - Static method in class org.apache.spark.mllib.tree.model.Split
apply(int) - Static method in class org.apache.spark.rdd.CheckpointState
apply(int) - Static method in class org.apache.spark.rdd.DeterministicLevel
apply(long, String, Option<String>, String, boolean) - Static method in class org.apache.spark.scheduler.AccumulableInfo: Deprecated. Do not create AccumulableInfo. Since 2.0.0.
apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo: Deprecated. Do not create AccumulableInfo. Since 2.0.0.
apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo: Deprecated. Do not create AccumulableInfo. Since 2.0.0.
apply(T1, T2, T3, T4) - Static method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
apply(T1, T2) - Static method in class org.apache.spark.scheduler.BlacklistedExecutor
apply(String, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$: Alternate factory method that takes a ByteBuffer directly for the data field
apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.local.KillTask
apply() - Static method in class org.apache.spark.scheduler.local.ReviveOffers
apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.local.StatusUpdate
apply() - Static method in class org.apache.spark.scheduler.local.StopExecutor
apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
apply(int) - Static method in class org.apache.spark.scheduler.SchedulingMode
apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
apply(T1, T2, T3, T4, T5, T6) - Static method in class org.apache.spark.scheduler.SparkListenerApplicationStart
apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
apply(T1, T2) - Static method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerBlockUpdated
apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
apply(T1, T2) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
apply(T1, T2) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorUnblacklisted
apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerJobEnd
apply(T1, T2, T3, T4) - Static method in class org.apache.spark.scheduler.SparkListenerJobStart
apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerLogStart
apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
apply(T1, T2) - Static method in class org.apache.spark.scheduler.SparkListenerNodeUnblacklisted
apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerSpeculativeTaskSubmitted
apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerStageCompleted
apply(T1, T2) - Static method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
apply(T1, T2, T3, T4, T5, T6) - Static method in class org.apache.spark.scheduler.SparkListenerTaskEnd
apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
apply(T1, T2, T3) - Static method in class org.apache.spark.scheduler.SparkListenerTaskStart
apply(T1) - Static method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
apply(int) - Static method in class org.apache.spark.scheduler.TaskLocality
apply(Object) - Method in class org.apache.spark.sql.Column: Extracts a value or values from a complex type.
apply(String) - Method in class org.apache.spark.sql.Dataset: Selects a column based on the column name and returns it as a Column.
apply(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Creates a Column for this UDAF using given Columns as input arguments.
apply(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Creates a Column for this UDAF using given Columns as input arguments.
apply(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedFunction: Returns an expression that invokes the UDF, using the given arguments.
apply(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedFunction: Returns an expression that invokes the UDF, using the given arguments.
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.DetermineTableStats
apply(T1, T2, T3, T4) - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
apply(ScriptInputOutputSchema) - Static method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
apply(T1, T2, T3, T4, T5, T6) - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
apply(T1, T2, T3, T4, T5) - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
apply(LogicalPlan) - Static method in class org.apache.spark.sql.hive.HiveAnalysis
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans$
apply(LogicalPlan) - Static method in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveStrategies.Scripts$
apply(LogicalPlan) - Static method in class org.apache.spark.sql.hive.HiveStrategies.Scripts
apply(T1, T2) - Static method in class org.apache.spark.sql.hive.HiveUDAFBuffer
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.RelationConversions
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.ResolveHiveSerdeTable
apply(T1, T2) - Static method in class org.apache.spark.sql.jdbc.JdbcType
apply(Dataset<Row>, Seq<Expression>, RelationalGroupedDataset.GroupType) - Static method in class org.apache.spark.sql.RelationalGroupedDataset
apply(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i.
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.And
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.EqualNullSafe
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.EqualTo
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.GreaterThan
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.GreaterThanOrEqual
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.In
apply(T1) - Static method in class org.apache.spark.sql.sources.IsNotNull
apply(T1) - Static method in class org.apache.spark.sql.sources.IsNull
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.LessThan
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.LessThanOrEqual
apply(T1) - Static method in class org.apache.spark.sql.sources.Not
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.Or
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.StringContains
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.StringEndsWith
apply(T1, T2) - Static method in class org.apache.spark.sql.sources.StringStartsWith
apply(String) - Static method in class org.apache.spark.sql.streaming.ProcessingTime: Deprecated. Use Trigger.ProcessingTime(interval).
apply(Duration) - Static method in class org.apache.spark.sql.streaming.ProcessingTime: Deprecated. Use Trigger.ProcessingTime(interval).
apply(DataType) - Static method in class org.apache.spark.sql.types.ArrayType: Construct an ArrayType object with the given element type.
apply(T1) - Static method in class org.apache.spark.sql.types.CharType
apply(double) - Static method in class org.apache.spark.sql.types.Decimal
apply(long) - Static method in class org.apache.spark.sql.types.Decimal
apply(int) - Static method in class org.apache.spark.sql.types.Decimal
apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
apply(BigInteger) - Static method in class org.apache.spark.sql.types.Decimal
apply(BigInt) - Static method in class org.apache.spark.sql.types.Decimal
apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
apply(long, int, int) - Static method in class org.apache.spark.sql.types.Decimal
apply(String) - Static method in class org.apache.spark.sql.types.Decimal
apply(DataType, DataType) - Static method in class org.apache.spark.sql.types.MapType: Construct a MapType object with the given key type and value type.
apply(T1, T2, T3, T4) - Static method in class org.apache.spark.sql.types.StructField
apply(String) - Method in class org.apache.spark.sql.types.StructType: Extracts the StructField with the given name.
apply(Set<String>) - Method in class org.apache.spark.sql.types.StructType: Returns a StructType containing StructFields of the given names, preserving the original order of fields.
apply(int) - Method in class org.apache.spark.sql.types.StructType
apply(T1) - Static method in class org.apache.spark.sql.types.VarcharType
apply(T1, T2, T3, T4, T5, T6, T7, T8) - Static method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
apply(T1, T2, T3, T4, T5, T6, T7) - Static method in class org.apache.spark.status.api.v1.ApplicationInfo
apply(T1) - Static method in class org.apache.spark.status.api.v1.StackTrace
apply(T1, T2, T3, T4, T5, T6, T7) - Static method in class org.apache.spark.status.api.v1.ThreadStackTrace
apply(int) - Method in class org.apache.spark.status.RDDPartitionSeq
apply(String) - Static method in class org.apache.spark.storage.BlockId
apply(String, String, int, Option<String>) - Static method in class org.apache.spark.storage.BlockManagerId: Returns a BlockManagerId for the given configuration.
apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
apply(T1, T2) - Static method in class org.apache.spark.storage.BroadcastBlockId
apply(T1, T2) - Static method in class org.apache.spark.storage.RDDBlockId
apply(T1, T2, T3) - Static method in class org.apache.spark.storage.ShuffleBlockId
apply(T1, T2, T3) - Static method in class org.apache.spark.storage.ShuffleDataBlockId
apply(T1, T2, T3) - Static method in class org.apache.spark.storage.ShuffleIndexBlockId
apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object.
apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object without setting useOffHeap.
apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object from its integer representation.
apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
apply(T1, T2) - Static method in class org.apache.spark.storage.StreamBlockId
apply(T1) - Static method in class org.apache.spark.storage.TaskResultBlockId
apply(T1) - Static method in class org.apache.spark.streaming.Duration
apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
apply(long) - Static method in class org.apache.spark.streaming.Minutes
apply(T1, T2, T3, T4, T5, T6) - Static method in class org.apache.spark.streaming.scheduler.BatchInfo
apply(T1, T2, T3, T4, T5, T6, T7) - Static method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
apply(T1, T2, T3, T4, T5, T6, T7, T8) - Static method in class org.apache.spark.streaming.scheduler.ReceiverInfo
apply(int) - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
apply(T1) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerStreamingStarted
apply(long) - Static method in class org.apache.spark.streaming.Seconds
apply(T1, T2, T3) - Static method in class org.apache.spark.TaskCommitDenied
apply(T1, T2, T3) - Static method in class org.apache.spark.TaskKilled
apply(int) - Static method in class org.apache.spark.TaskState
apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter: Build a StatCounter from a list of values.
apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter: Build a StatCounter from a list of values passed as variable-length arguments.
ApplyInPlace - Class in org.apache.spark.ml.ann: Implements in-place application of functions in the arrays
ApplyInPlace() - Constructor for class org.apache.spark.ml.ann.ApplyInPlace
applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext: Deprecated. Use createDataFrame instead. Since 1.3.0.
applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext: Deprecated. Use createDataFrame instead. Since 1.3.0.
applySchema(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext: Deprecated. Use createDataFrame instead. Since 1.3.0.
applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext: Deprecated. Use createDataFrame instead. Since 1.3.0.
appName() - Method in class org.apache.spark.api.java.JavaSparkContext
appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
appName() - Method in class org.apache.spark.SparkContext
appName(String) - Method in class org.apache.spark.sql.SparkSession.Builder: Sets a name for the application, which will be shown in the Spark web UI.
approx_count_distinct(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the approximate number of distinct items in a group.
approx_count_distinct(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the approximate number of distinct items in a group.
approx_count_distinct(Column, double) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the approximate number of distinct items in a group.
approx_count_distinct(String, double) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(Column) - Static method in class org.apache.spark.sql.functions: Deprecated. Use approx_count_distinct. Since 2.1.0.
approxCountDistinct(String) - Static method in class org.apache.spark.sql.functions: Deprecated. Use approx_count_distinct. Since 2.1.0.
approxCountDistinct(Column, double) - Static method in class org.apache.spark.sql.functions: Deprecated. Use approx_count_distinct. Since 2.1.0.
approxCountDistinct(String, double) - Static method in class org.apache.spark.sql.functions: Deprecated. Use approx_count_distinct. Since 2.1.0.
ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
ApproximateEvaluator<U,R> - Interface in org.apache.spark.partial: An object that computes a function incrementally by merging in results of type U from multiple tasks.
approxQuantile(String, double[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Calculates the approximate quantiles of a numerical column of a DataFrame.
approxQuantile(String[], double[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Calculates the approximate quantiles of numerical columns of a DataFrame.
appSparkVersion() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
AppStatusUtils - Class in org.apache.spark.status
AppStatusUtils() - Constructor for class org.apache.spark.status.AppStatusUtils
AreaUnderCurve - Class in org.apache.spark.mllib.evaluation: Computes the area under the curve (AUC) using the trapezoidal rule.
AreaUnderCurve() - Constructor for class org.apache.spark.mllib.evaluation.AreaUnderCurve
areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Computes the area under the precision-recall curve.
areaUnderROC() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Computes the area under the receiver operating characteristic (ROC) curve.
areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Computes the area under the receiver operating characteristic (ROC) curve.
argmax() - Method in class org.apache.spark.ml.linalg.DenseVector
argmax() - Method in class org.apache.spark.ml.linalg.SparseVector
argmax() - Method in interface org.apache.spark.ml.linalg.Vector: Find the index of a maximal element.
argmax() - Method in class org.apache.spark.mllib.linalg.DenseVector
argmax() - Method in class org.apache.spark.mllib.linalg.SparseVector
argmax() - Method in interface org.apache.spark.mllib.linalg.Vector: Find the index of a maximal element.
argString() - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
array(DataType) - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type array.
array(Column...) - Static method in class org.apache.spark.sql.functions: Creates a new array column.
array(String, String...) - Static method in class org.apache.spark.sql.functions: Creates a new array column.
array(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Creates a new array column.
array(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Creates a new array column.
array() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
array_contains(Column, Object) - Static method in class org.apache.spark.sql.functions: Returns null if the array is null, true if the array contains value, and false otherwise.
array_distinct(Column) - Static method in class org.apache.spark.sql.functions: Removes duplicate values from the array.
array_except(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns an array of the elements in the first array but not in the second array, without duplicates.
array_intersect(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns an array of the elements in the intersection of the given two arrays, without duplicates.
array_join(Column, String, String) - Static method in class org.apache.spark.sql.functions: Concatenates the elements of column using the delimiter.
array_join(Column, String) - Static method in class org.apache.spark.sql.functions: Concatenates the elements of column using the delimiter.
array_max(Column) - Static method in class org.apache.spark.sql.functions: Returns the maximum value in the array.
array_min(Column) - Static method in class org.apache.spark.sql.functions: Returns the minimum value in the array.
array_position(Column, Object) - Static method in class org.apache.spark.sql.functions: Locates the position of the first occurrence of the value in the given array as long.
array_remove(Column, Object) - Static method in class org.apache.spark.sql.functions: Removes all elements equal to element from the given array.
array_repeat(Column, Column) - Static method in class org.apache.spark.sql.functions: Creates an array containing the left argument repeated the number of times given by the right argument.
array_repeat(Column, int) - Static method in class org.apache.spark.sql.functions: Creates an array containing the left argument repeated the number of times given by the right argument.
array_sort(Column) - Static method in class org.apache.spark.sql.functions: Sorts the input array in ascending order.
array_union(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns an array of the elements in the union of the given two arrays, without duplicates.
arrayLengthGt(double) - Static method in class org.apache.spark.ml.param.ParamValidators: Check that the array length is greater than lowerBound.
arrays_overlap(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns true if a1 and a2 have at least one non-null element in common.
arrays_zip(Column...) - Static method in class org.apache.spark.sql.functions: Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.
arrays_zip(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.
ArrayType - Class in org.apache.spark.sql.types
ArrayType(DataType, boolean) - Constructor for class org.apache.spark.sql.types.ArrayType
arrayValues() - Method in class org.apache.spark.storage.memory.DeserializedValuesHolder
ArrowColumnVector - Class in org.apache.spark.sql.vectorized: A column vector backed by Apache Arrow.
ArrowColumnVector(ValueVector) - Constructor for class org.apache.spark.sql.vectorized.ArrowColumnVector
as(Encoder) - Method in class org.apache.spark.sql.Column: Provides a type hint about the expected return value of this column.
as(String) - Method in class org.apache.spark.sql.Column: Gives the column an alias.
as(Seq<String>) - Method in class org.apache.spark.sql.Column: (Scala-specific) Assigns the given aliases to the results of a table generating function.
as(String[]) - Method in class org.apache.spark.sql.Column: Assigns the given aliases to the results of a table generating function.
as(Symbol) - Method in class org.apache.spark.sql.Column: Gives the column an alias.
as(String, Metadata) - Method in class org.apache.spark.sql.Column: Gives the column an alias with metadata.
as(Encoder) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: Returns a new Dataset where each record has been mapped on to the specified type.
as(String) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with an alias set.
as(Symbol) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Returns a new Dataset with an alias set.
asBinary() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Convenient method for casting to binary logistic regression summary.
asBreeze() - Method in interface org.apache.spark.ml.linalg.Matrix: Converts to a breeze matrix.
asBreeze() - Method in interface org.apache.spark.ml.linalg.Vector: Converts the instance to a breeze vector.
asBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix: Converts to a breeze matrix.
asBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts the instance to a breeze vector.
asc() - Method in class org.apache.spark.sql.Column: Returns a sort expression based on ascending order of the column.
asc(String) - Static method in class org.apache.spark.sql.functions: Returns a sort expression based on ascending order of the column.
asc_nulls_first() - Method in class org.apache.spark.sql.Column: Returns a sort expression based on ascending order of the column, and null values appear before non-null values.
asc_nulls_first(String) - Static method in class org.apache.spark.sql.functions: Returns a sort expression based on ascending order of the column, and null values appear before non-null values.
asc_nulls_last() - Method in class org.apache.spark.sql.Column: Returns a sort expression based on ascending order of the column, and null values appear after non-null values.
asc_nulls_last(String) - Static method in class org.apache.spark.sql.functions: Returns a sort expression based on ascending order of the column, and null values appear after non-null values.
ascii(Column) - Static method in class org.apache.spark.sql.functions: Computes the numeric value of the first character of the string column, and returns the result as an int column.
asin(Column) - Static method in class org.apache.spark.sql.functions
asin(String) - Static method in class org.apache.spark.sql.functions
asIterator() - Method in class org.apache.spark.serializer.DeserializationStream: Read the elements of this stream through an iterator.
asJavaPairRDD() - Method in class org.apache.spark.api.r.PairwiseRRDD
asJavaRDD() - Method in class org.apache.spark.api.r.RRDD
asJavaRDD() - Method in class org.apache.spark.api.r.StringRRDD
asKeyValueIterator() - Method in class org.apache.spark.serializer.DeserializationStream: Read the elements of this stream through an iterator over key-value pairs.
AskPermissionToCommitOutput - Class in org.apache.spark.scheduler
AskPermissionToCommitOutput(int, int, int, int) - Constructor for class org.apache.spark.scheduler.AskPermissionToCommitOutput
askRpcTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils: Returns the default Spark timeout to use for RPC ask operations.
askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
asMap() - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions
asML() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
asML() - Method in class org.apache.spark.mllib.linalg.DenseVector
asML() - Method in interface org.apache.spark.mllib.linalg.Matrix: Convert this matrix to the new mllib-local representation.
asML() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
asML() - Method in class org.apache.spark.mllib.linalg.SparseVector
asML() - Method in interface org.apache.spark.mllib.linalg.Vector: Convert this vector to the new mllib-local representation.
asNondeterministic() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction: Updates UserDefinedFunction to nondeterministic.
asNonNullable() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction: Updates UserDefinedFunction to non-nullable.
asNullable() - Method in class org.apache.spark.sql.types.ObjectType
asRDDId() - Method in class org.apache.spark.storage.BlockId
assertConf(JobContext, SparkConf) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
assertNotSpilled(SparkContext, String, Function0<BoxedUnit>) - Static method in class org.apache.spark.TestUtils: Run some code involving jobs submitted to the given context and assert that the jobs did not spill.
assertSpilled(SparkContext, String, Function0<BoxedUnit>) - Static method in class org.apache.spark.TestUtils: Run some code involving jobs submitted to the given context and assert that the jobs spilled.
assignClusters(Dataset<?>) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering: Runs the PIC algorithm and returns a cluster assignment for each input vertex.
Assignment(long, int) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
Assignment$() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$
assignments() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
AssociationRules - Class in org.apache.spark.ml.fpm
AssociationRules() - Constructor for class org.apache.spark.ml.fpm.AssociationRules
associationRules() - Method in class org.apache.spark.ml.fpm.FPGrowthModel: Get association rules fitted using the minConfidence.
AssociationRules - Class in org.apache.spark.mllib.fpm: Generates association rules from an RDD[FreqItemset[Item]].
AssociationRules() - Constructor for class org.apache.spark.mllib.fpm.AssociationRules: Constructs a default instance with default parameters {minConfidence = 0.8}.
AssociationRules.Rule<Item> - Class in org.apache.spark.mllib.fpm: An association rule between sets of items.
ASYNC_TRACKING_ENABLED() - Static method in class org.apache.spark.status.config
AsyncEventQueue - Class in org.apache.spark.scheduler: An asynchronous queue for events.
AsyncEventQueue(String, SparkConf, LiveListenerBusMetrics, LiveListenerBus) - Constructor for class org.apache.spark.scheduler.AsyncEventQueue
AsyncRDDActions<T> - Class in org.apache.spark.rdd: A set of asynchronous RDD actions available through an implicit conversion.
AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
atan(Column) - Static method in class org.apache.spark.sql.functions
atan(String) - Static method in class org.apache.spark.sql.functions
atan2(Column, Column) - Static method in class org.apache.spark.sql.functions
atan2(Column, String) - Static method in class org.apache.spark.sql.functions
atan2(String, Column) - Static method in class org.apache.spark.sql.functions
atan2(String, String) - Static method in class org.apache.spark.sql.functions
atan2(Column, double) - Static method in class org.apache.spark.sql.functions
atan2(String, double) - Static method in class org.apache.spark.sql.functions
atan2(double, Column) - Static method in class org.apache.spark.sql.functions
atan2(double, String) - Static method in class org.apache.spark.sql.functions
attempt() - Method in class org.apache.spark.status.api.v1.TaskData
ATTEMPT() - Static method in class org.apache.spark.status.TaskIndexNames
attemptId() - Method in class org.apache.spark.scheduler.StageInfo: Deprecated. Use attemptNumber instead. Since 2.3.0.
attemptId() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
attemptId() - Method in interface org.apache.spark.status.api.v1.BaseAppResource
attemptId() - Method in class org.apache.spark.status.api.v1.StageData
attemptNumber() - Method in class org.apache.spark.BarrierTaskContext
attemptNumber() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
attemptNumber() - Method in class org.apache.spark.scheduler.StageInfo
attemptNumber() - Method in class org.apache.spark.scheduler.TaskInfo
attemptNumber() - Method in class org.apache.spark.TaskCommitDenied
attemptNumber() - Method in class org.apache.spark.TaskContext: How many times this task has been attempted.
attempts() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
AtTimestamp(Date) - Constructor for class org.apache.spark.streaming.kinesis.KinesisInitialPositions.AtTimestamp
attr() - Method in class org.apache.spark.graphx.Edge
attr() - Method in class org.apache.spark.graphx.EdgeContext: The attribute associated with the edge.
attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
Attribute - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: Abstract class for ML attributes.
Attribute() - Constructor for class org.apache.spark.ml.attribute.Attribute
attribute() - Method in class org.apache.spark.sql.sources.EqualNullSafe
attribute() - Method in class org.apache.spark.sql.sources.EqualTo
attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
attribute() - Method in class org.apache.spark.sql.sources.In
attribute() - Method in class org.apache.spark.sql.sources.IsNotNull
attribute() - Method in class org.apache.spark.sql.sources.IsNull
attribute() - Method in class org.apache.spark.sql.sources.LessThan
attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
attribute() - Method in class org.apache.spark.sql.sources.StringContains
attribute() - Method in class org.apache.spark.sql.sources.StringEndsWith
attribute() - Method in class org.apache.spark.sql.sources.StringStartsWith
AttributeFactory - Interface in org.apache.spark.ml.attribute: Trait for ML attribute factories.
AttributeGroup - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: Attributes that describe a vector ML column.
AttributeGroup(String) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup: Creates an attribute group without attribute info.
AttributeGroup(String, int) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup: Creates an attribute group knowing only the number of attributes.
AttributeGroup(String, Attribute[]) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup: Creates an attribute group with attributes.
AttributeKeys - Class in org.apache.spark.ml.attribute: Keys used to store attributes.
AttributeKeys() - Constructor for class org.apache.spark.ml.attribute.AttributeKeys
attributes() - Method in class org.apache.spark.ml.attribute.AttributeGroup: Optional array of attributes.
ATTRIBUTES() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
AttributeType - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: An enum-like type for attribute types: AttributeType$.Numeric, AttributeType$.Nominal, and AttributeType$.Binary.
AttributeType(String) - Constructor for class org.apache.spark.ml.attribute.AttributeType
attrType() - Method in class org.apache.spark.ml.attribute.Attribute: Attribute type.
attrType() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
attrType() - Method in class org.apache.spark.ml.attribute.NominalAttribute
attrType() - Method in class org.apache.spark.ml.attribute.NumericAttribute
attrType() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
available() - Method in class org.apache.spark.io.NioBufferedFileInputStream
available() - Method in class org.apache.spark.io.ReadAheadInputStream
available() - Method in class org.apache.spark.storage.BufferReleasingInputStream
Average() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
avg(MapFunction<T, Double>) - Static method in class org.apache.spark.sql.expressions.javalang.typed: Average aggregate function.
avg(Function1<IN, Object>) - Static method in class org.apache.spark.sql.expressions.scalalang.typed: Average aggregate function.
avg(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the average of the values in a group.
avg(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the average of the values in a group.
avg(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Computes the mean value of each numeric column for each group.
avg(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Computes the mean value of each numeric column for each group.
avg() - Method in class org.apache.spark.util.DoubleAccumulator: Returns the average of elements added to the accumulator.
avg() - Method in class org.apache.spark.util.LongAccumulator: Returns the average of elements added to the accumulator.
avgEventRate() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
avgInputRate() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
avgMetrics() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
avgProcessingTime() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
avgSchedulingDelay() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
avgTotalDelay() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
awaitAnyTermination() - Method in class org.apache.spark.sql.streaming.StreamingQueryManager: Wait until any of the queries on the associated SQLContext has terminated since the creation of the context, or since resetTerminated() was called.
awaitAnyTermination(long) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager: Wait until any of the queries on the associated SQLContext has terminated since the creation of the context, or since resetTerminated() was called.
awaitReady(Awaitable<T>, Duration) - Static method in class org.apache.spark.util.ThreadUtils: Preferred alternative to Await.ready().
awaitResult(Awaitable<T>, Duration) - Static method in class org.apache.spark.util.ThreadUtils: Preferred alternative to Await.result().
awaitTermination() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Waits for the termination of this query, either by query.stop() or by an exception.
awaitTermination(long) - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Waits for the termination of this query, either by query.stop() or by an exception.
awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Wait for the execution to stop.
awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext: Wait for the execution to stop.
awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Wait for the execution to stop.
awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.StreamingContext: Wait for the execution to stop.
axpy(double, Vector, Vector) - Static method in class org.apache.spark.ml.linalg.BLAS: y += a * x
axpy(double, Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS: y += a * x
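
The two axpy entries above summarize the operation as y += a * x. As a rough illustration of what that update means, here is a minimal plain-Java sketch on dense double arrays. It does not call Spark's BLAS class or its Vector types; the class name AxpySketch and the array-based signature are assumptions made for this example only.

```java
// Plain-Java illustration of axpy (y += a * x) on dense arrays.
// AxpySketch is a hypothetical name; this does not use Spark's BLAS class.
class AxpySketch {
    /** Updates y in place so that y[i] becomes y[i] + a * x[i]. */
    static void axpy(double a, double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        for (int i = 0; i < x.length; i++) {
            y[i] += a * x[i];
        }
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {10.0, 20.0, 30.0};
        axpy(2.0, x, y);
        // y was modified in place; x is unchanged.
        System.out.println(java.util.Arrays.toString(y)); // prints [12.0, 24.0, 36.0]
    }
}
```

Note that the update is in place: only y changes, which is why the index entry writes the operation as y += a * x and not as a function returning a new vector.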

B

BACKUP_STANDALONE_MASTER_PREFIX() - Static method in class org.apache.spark.util.Utils: An identifier that backup masters use in their responses.
balanceSlack() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
barrier() - Method in class org.apache.spark.BarrierTaskContext: :: Experimental :: Sets a global barrier and waits until all tasks in this stage hit this barrier.
barrier() - Method in class org.apache.spark.rdd.RDD: :: Experimental :: Marks the current stage as a barrier stage, where Spark must launch all tasks together.
BarrierCoordinatorMessage - Interface in org.apache.spark
BarrierTaskContext - Class in org.apache.spark: :: Experimental :: A TaskContext with extra contextual info and tooling for tasks in a barrier stage.
BarrierTaskInfo - Class in org.apache.spark: :: Experimental :: Carries all task infos of a barrier task.
base64(Column) - Static method in class org.apache.spark.sql.functions: Computes the BASE64 encoding of a binary column and returns it as a string column.
BaseAppResource - Interface in org.apache.spark.status.api.v1: Base class for resource handlers that use app-specific data.
baseOn(ParamPair<?>...) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Sets the given parameters in this grid to fixed values.
baseOn(ParamMap) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Sets the given parameters in this grid to fixed values.
baseOn(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Sets the given parameters in this grid to fixed values.
BaseReadWrite - Interface in org.apache.spark.ml.util: Trait for MLWriter and MLReader.
BaseRelation - Class in org.apache.spark.sql.sources: Represents a collection of tuples with a known schema.
BaseRelation() - Constructor for class org.apache.spark.sql.sources.BaseRelation
baseRelationToDataFrame(BaseRelation) - Method in class org.apache.spark.sql.SparkSession: Convert a BaseRelation created for external data sources into a DataFrame.
baseRelationToDataFrame(BaseRelation) - Method in class org.apache.spark.sql.SQLContext
BaseRRDD<T,U> - Class in org.apache.spark.api.r
BaseRRDD(RDD<T>, int, byte[], String, String, byte[], Broadcast<Object>[], ClassTag<T>, ClassTag) - Constructor for class org.apache.spark.api.r.BaseRRDD
BaseStreamingAppResource - Interface in org.apache.spark.status.api.v1.streaming: Base class for streaming API handlers, provides easy access to the streaming listener that holds the app's information.
BasicBlockReplicationPolicy - Class in org.apache.spark.storage
BasicBlockReplicationPolicy() - Constructor for class org.apache.spark.storage.BasicBlockReplicationPolicy
basicCredentials(String, String) - Method in class org.apache.spark.streaming.kinesis.SparkAWSCredentials.Builder: Use a basic AWS keypair for long-lived authorization.
basicSparkPage(HttpServletRequest, Function0<Seq<Node>>, String, boolean) - Static method in class org.apache.spark.ui.UIUtils: Returns a page with the spark css/js and a simple format.
batchDuration() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
batchDuration() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
BATCHES() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
batchId() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
batchId() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
BatchInfo - Class in org.apache.spark.status.api.v1.streaming
BatchInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Class having information on completed batches.
BatchInfo(Time, Map<Object, StreamInputInfo>, long, Option<Object>, Option<Object>, Map<Object, OutputOperationInfo>) - Constructor for class org.apache.spark.streaming.scheduler.BatchInfo
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
batchInfos() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
BatchStatus - Enum in org.apache.spark.status.api.v1.streaming
batchTime() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
batchTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
batchTime() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
bbos() - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
bean(Class<T>) - Static method in class org.apache.spark.sql.Encoders: Creates an encoder for Java Bean of type T.
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
beforeFetch(Connection, Map<String, String>) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Override connection specific properties to run before a select is made.
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
BernoulliCellSampler<T> - Class in org.apache.spark.util.random: :: DeveloperApi :: A sampler based on Bernoulli trials for partitioning a data sequence.
BernoulliCellSampler(double, double, boolean) - Constructor for class org.apache.spark.util.random.BernoulliCellSampler
BernoulliSampler<T> - Class in org.apache.spark.util.random: :: DeveloperApi :: A sampler based on Bernoulli trials.
BernoulliSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.BernoulliSampler
bestModel() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
bestModel() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
beta() - Method in class org.apache.spark.mllib.random.WeibullGenerator
between(Object, Object) - Method in class org.apache.spark.sql.Column: True if the current column is between the lower bound and upper bound, inclusive.
bin(Column) - Static method in class org.apache.spark.sql.functions: An expression that returns the string representation of the binary value of the given long column.
bin(String) - Static method in class org.apache.spark.sql.functions: An expression that returns the string representation of the binary value of the given long column.
Binarizer - Class in org.apache.spark.ml.feature: Binarize a column of continuous features given a threshold.
Binarizer(String) - Constructor for class org.apache.spark.ml.feature.Binarizer
Binarizer() - Constructor for class org.apache.spark.ml.feature.Binarizer
Binary() - Static method in class org.apache.spark.ml.attribute.AttributeType: Binary type.
binary() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams: Binary toggle to control the output vector values.
binary() - Method in class org.apache.spark.ml.feature.HashingTF: Binary toggle to control term frequency counts.
binary() - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type binary.
BINARY() - Static method in class org.apache.spark.sql.Encoders: An encoder for arrays of bytes.
BinaryAttribute - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: A binary attribute.
BinaryClassificationEvaluator - Class in org.apache.spark.ml.evaluation: :: Experimental :: Evaluator for binary classification, which expects two input columns: rawPrediction and label.
BinaryClassificationEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
BinaryClassificationEvaluator() - Constructor for class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
BinaryClassificationMetricComputer - Interface in org.apache.spark.mllib.evaluation.binary: Trait for a binary classification evaluation metric computer.
BinaryClassificationMetrics - Class in org.apache.spark.mllib.evaluation: Evaluator for binary classification.
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>, int) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Defaults numBins to 0.
BinaryConfusionMatrix - Interface in org.apache.spark.mllib.evaluation.binary: Trait for a binary confusion matrix.
binaryFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of binary files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI as a byte array.
binaryFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of binary files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI as a byte array.
binaryFiles(String, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop-readable dataset as PortableDataStream for each file (useful for binary data).
binaryLabelValidator() - Static method in class org.apache.spark.mllib.util.DataValidators: Function to check if labels used for classification are either zero or one.
BinaryLogisticRegressionSummary - Interface in org.apache.spark.ml.classification: :: Experimental :: Abstraction for binary logistic regression results for a given model.
BinaryLogisticRegressionSummaryImpl - Class in org.apache.spark.ml.classification: Binary logistic regression results for a given model.
BinaryLogisticRegressionSummaryImpl(Dataset<Row>, String, String, String, String) - Constructor for class org.apache.spark.ml.classification.BinaryLogisticRegressionSummaryImpl
BinaryLogisticRegressionTrainingSummary - Interface in org.apache.spark.ml.classification: :: Experimental :: Abstraction for binary logistic regression training results.
BinaryLogisticRegressionTrainingSummaryImpl - Class in org.apache.spark.ml.classification: Binary logistic regression training results.
BinaryLogisticRegressionTrainingSummaryImpl(Dataset<Row>, String, String, String, String, double[]) - Constructor for class org.apache.spark.ml.classification.BinaryLogisticRegressionTrainingSummaryImpl
binaryMetrics() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
binaryRecords(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Load data from a flat binary file, assuming the length of each record is constant.
binaryRecords(String, int, Configuration) - Method in class org.apache.spark.SparkContext: Load data from a flat binary file, assuming the length of each record is constant.
binaryRecordsStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as flat binary files with fixed record lengths, yielding byte arrays
binaryRecordsStream(String, int) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as flat binary files, assuming a fixed length per record, generating one byte array per record.
BinarySample - Class in org.apache.spark.mllib.stat.test: Class that represents the group and value of a sample.
BinarySample(boolean, double) - Constructor for class org.apache.spark.mllib.stat.test.BinarySample
binarySummary() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: Gets summary of model on training set.
BinaryType - Class in org.apache.spark.sql.types: The data type representing Array[Byte] values.
BinaryType() - Constructor for class org.apache.spark.sql.types.BinaryType
BinaryType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the BinaryType object.
Binomial$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
BinomialBounds - Class in org.apache.spark.util.random: Utility functions that help us determine bounds on adjusted sampling rate to guarantee exact sample size with high confidence when sampling without replacement.
BinomialBounds() - Constructor for class org.apache.spark.util.random.BinomialBounds
BisectingKMeans - Class in org.apache.spark.ml.clustering: A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark.
BisectingKMeans(String) - Constructor for class org.apache.spark.ml.clustering.BisectingKMeans
BisectingKMeans() - Constructor for class org.apache.spark.ml.clustering.BisectingKMeans
BisectingKMeans - Class in org.apache.spark.mllib.clustering: A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark.
BisectingKMeans() - Constructor for class org.apache.spark.mllib.clustering.BisectingKMeans: Constructs with the default configuration
BisectingKMeansModel - Class in org.apache.spark.ml.clustering: Model fitted by BisectingKMeans.
BisectingKMeansModel - Class in org.apache.spark.mllib.clustering: Clustering model produced by BisectingKMeans.
BisectingKMeansModel(ClusteringTreeNode) - Constructor for class org.apache.spark.mllib.clustering.BisectingKMeansModel
BisectingKMeansModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.clustering
BisectingKMeansModel.SaveLoadV2_0$ - Class in org.apache.spark.mllib.clustering
BisectingKMeansParams - Interface in org.apache.spark.ml.clustering: Common params for BisectingKMeans and BisectingKMeansModel
BisectingKMeansSummary - Class in org.apache.spark.ml.clustering: :: Experimental :: Summary of BisectingKMeans.
bitSize() - Method in class org.apache.spark.util.sketch.BloomFilter: Returns the number of bits in the underlying bit array.
bitwiseAND(Object) - Method in class org.apache.spark.sql.Column: Compute bitwise AND of this expression with another expression.
bitwiseNOT(Column) - Static method in class org.apache.spark.sql.functions: Computes bitwise NOT (~) of a number.
bitwiseOR(Object) - Method in class org.apache.spark.sql.Column: Compute bitwise OR of this expression with another expression.
bitwiseXOR(Object) - Method in class org.apache.spark.sql.Column: Compute bitwise XOR of this expression with another expression.
BLACKLISTED() - Static method in class org.apache.spark.ui.ToolTips
BlacklistedExecutor - Class in org.apache.spark.scheduler
BlacklistedExecutor(String, long) - Constructor for class org.apache.spark.scheduler.BlacklistedExecutor
blackListedExecutors() - Method in class org.apache.spark.status.LiveStage
blacklistedInStages() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
blacklistedInStages() - Method in class org.apache.spark.status.LiveExecutor
BLAS - Class in org.apache.spark.ml.linalg: BLAS routines for MLlib's vectors and matrices.
BLAS() - Constructor for class org.apache.spark.ml.linalg.BLAS
BLAS - Class in org.apache.spark.mllib.linalg: BLAS routines for MLlib's vectors and matrices.
BLAS() - Constructor for class org.apache.spark.mllib.linalg.BLAS
BlockData - Interface in org.apache.spark.storage: Abstracts away how blocks are stored and provides different ways to read the underlying block data.
blockedByLock() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
blockedByThreadId() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
BlockEvictionHandler - Interface in org.apache.spark.storage.memory
BlockGeneratorListener - Interface in org.apache.spark.streaming.receiver: Listener object for BlockGenerator events
BlockId - Class in org.apache.spark.storage: :: DeveloperApi :: Identifies a particular Block of data, usually associated with a single file.
BlockId() - Constructor for class org.apache.spark.storage.BlockId
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocations
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocationsAndStatus
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBlock
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
blockId() - Method in class org.apache.spark.storage.BlockUpdatedInfo
blockId() - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockStoreResult
blockIds() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds
BlockLocationsAndStatus(Seq<BlockManagerId>, BlockStatus) - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockLocationsAndStatus
BlockLocationsAndStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockLocationsAndStatus$
blockManager() - Method in class org.apache.spark.SparkEnv
blockManagerAddedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
blockManagerAddedToJson(SparkListenerBlockManagerAdded) - Static method in class org.apache.spark.util.JsonProtocol
BlockManagerHeartbeat(BlockManagerId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
BlockManagerHeartbeat$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat$
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
BlockManagerId - Class in org.apache.spark.storage: :: DeveloperApi :: This class represents a unique identifier for a BlockManager.
BlockManagerId() - Constructor for class org.apache.spark.storage.BlockManagerId
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetPeers
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
blockManagerId() - Method in class org.apache.spark.storage.BlockUpdatedInfo
blockManagerIdCache() - Static method in class org.apache.spark.storage.BlockManagerId: The max cache size is hardcoded to 10000. Since the size of a BlockManagerId object is about 48B, the total memory cost should be below 1MB, which is feasible.
blockManagerIdFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
blockManagerIdToJson(BlockManagerId) - Static method in class org.apache.spark.util.JsonProtocol
BlockManagerMessages - Class in org.apache.spark.storage
BlockManagerMessages() - Constructor for class org.apache.spark.storage.BlockManagerMessages
BlockManagerMessages.BlockLocationsAndStatus - Class in org.apache.spark.storage
BlockManagerMessages.BlockLocationsAndStatus$ - Class in org.apache.spark.storage
BlockManagerMessages.BlockManagerHeartbeat - Class in org.apache.spark.storage
BlockManagerMessages.BlockManagerHeartbeat$ - Class in org.apache.spark.storage
BlockManagerMessages.GetBlockStatus - Class in org.apache.spark.storage
BlockManagerMessages.GetBlockStatus$ - Class in org.apache.spark.storage
BlockManagerMessages.GetExecutorEndpointRef - Class in org.apache.spark.storage
BlockManagerMessages.GetExecutorEndpointRef$ - Class in org.apache.spark.storage
BlockManagerMessages.GetLocations - Class in org.apache.spark.storage
BlockManagerMessages.GetLocations$ - Class in org.apache.spark.storage
BlockManagerMessages.GetLocationsAndStatus - Class in org.apache.spark.storage
BlockManagerMessages.GetLocationsAndStatus$ - Class in org.apache.spark.storage
BlockManagerMessages.GetLocationsMultipleBlockIds - Class in org.apache.spark.storage
BlockManagerMessages.GetLocationsMultipleBlockIds$ - Class in org.apache.spark.storage
BlockManagerMessages.GetMatchingBlockIds - Class in org.apache.spark.storage
BlockManagerMessages.GetMatchingBlockIds$ - Class in org.apache.spark.storage
BlockManagerMessages.GetMemoryStatus$ - Class in org.apache.spark.storage
BlockManagerMessages.GetPeers - Class in org.apache.spark.storage
BlockManagerMessages.GetPeers$ - Class in org.apache.spark.storage
BlockManagerMessages.GetStorageStatus$ - Class in org.apache.spark.storage
BlockManagerMessages.HasCachedBlocks - Class in org.apache.spark.storage
BlockManagerMessages.HasCachedBlocks$ - Class in org.apache.spark.storage
BlockManagerMessages.RegisterBlockManager - Class in org.apache.spark.storage
BlockManagerMessages.RegisterBlockManager$ - Class in org.apache.spark.storage
BlockManagerMessages.RemoveBlock - Class in org.apache.spark.storage
BlockManagerMessages.RemoveBlock$ - Class in org.apache.spark.storage
BlockManagerMessages.RemoveBroadcast - Class in org.apache.spark.storage
BlockManagerMessages.RemoveBroadcast$ - Class in org.apache.spark.storage
BlockManagerMessages.RemoveExecutor - Class in org.apache.spark.storage
BlockManagerMessages.RemoveExecutor$ - Class in org.apache.spark.storage
BlockManagerMessages.RemoveRdd - Class in org.apache.spark.storage
BlockManagerMessages.RemoveRdd$ - Class in org.apache.spark.storage
BlockManagerMessages.RemoveShuffle - Class in org.apache.spark.storage
BlockManagerMessages.RemoveShuffle$ - Class in org.apache.spark.storage
BlockManagerMessages.ReplicateBlock - Class in org.apache.spark.storage
BlockManagerMessages.ReplicateBlock$ - Class in org.apache.spark.storage
BlockManagerMessages.StopBlockManagerMaster$ - Class in org.apache.spark.storage
BlockManagerMessages.ToBlockManagerMaster - Interface in org.apache.spark.storage
BlockManagerMessages.ToBlockManagerSlave - Interface in org.apache.spark.storage
BlockManagerMessages.TriggerThreadDump$ - Class in org.apache.spark.storage: Driver to Executor message to trigger a thread dump.
BlockManagerMessages.UpdateBlockInfo - Class in org.apache.spark.storage
BlockManagerMessages.UpdateBlockInfo$ - Class in org.apache.spark.storage
blockManagerRemovedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
blockManagerRemovedToJson(SparkListenerBlockManagerRemoved) - Static method in class org.apache.spark.util.JsonProtocol
BlockMatrix - Class in org.apache.spark.mllib.linalg.distributed: Represents a distributed matrix in blocks of local matrices.
BlockMatrix(RDD<Tuple2<Tuple2<Object, Object>, Matrix>>, int, int, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.BlockMatrix
BlockMatrix(RDD<Tuple2<Tuple2<Object, Object>, Matrix>>, int, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Alternate constructor for BlockMatrix without the input of the number of rows and columns.
blockName() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
blockName() - Method in class org.apache.spark.status.LiveRDDPartition
BlockNotFoundException - Exception in org.apache.spark.storage
BlockNotFoundException(String) - Constructor for exception org.apache.spark.storage.BlockNotFoundException
BlockReplicationPolicy - Interface in org.apache.spark.storage: :: DeveloperApi :: BlockReplicationPolicy provides logic for prioritizing a sequence of peers for replicating blocks.
BlockReplicationUtils - Class in org.apache.spark.storage
BlockReplicationUtils() - Constructor for class org.apache.spark.storage.BlockReplicationUtils
blocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
blockSize() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams: Block size for stacking input data in matrices to speed up the computation.
BlockStatus - Class in org.apache.spark.storage
BlockStatus(StorageLevel, long, long) - Constructor for class org.apache.spark.storage.BlockStatus
blockStatusFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
blockStatusToJson(BlockStatus) - Static method in class org.apache.spark.util.JsonProtocol
blockUpdatedInfo() - Method in class org.apache.spark.scheduler.SparkListenerBlockUpdated
BlockUpdatedInfo - Class in org.apache.spark.storage: :: DeveloperApi :: Stores information about a block status in a block manager.
BlockUpdatedInfo(BlockManagerId, BlockId, StorageLevel, long, long) - Constructor for class org.apache.spark.storage.BlockUpdatedInfo
blockUpdatedInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
blockUpdatedInfoToJson(BlockUpdatedInfo) - Static method in class org.apache.spark.util.JsonProtocol
blockUpdateFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
blockUpdateToJson(SparkListenerBlockUpdated) - Static method in class org.apache.spark.util.JsonProtocol
bloomFilter(String, long, double) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Builds a Bloom filter over a specified column.
bloomFilter(Column, long, double) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Builds a Bloom filter over a specified column.
bloomFilter(String, long, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Builds a Bloom filter over a specified column.
bloomFilter(Column, long, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Builds a Bloom filter over a specified column.
BloomFilter - Class in org.apache.spark.util.sketch: A Bloom filter is a space-efficient probabilistic data structure that offers an approximate containment test with one-sided error: if it claims that an item is contained in it, this might be in error, but if it claims that an item is not contained in it, then this is definitely true.
BloomFilter() - Constructor for class org.apache.spark.util.sketch.BloomFilter
BloomFilter.Version - Enum in org.apache.spark.util.sketch
bmAddress() - Method in class org.apache.spark.FetchFailed
BOOLEAN() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable boolean type.
BooleanParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Boolean] for Java.
BooleanParam(String, String, String) - Constructor for class org.apache.spark.ml.param.BooleanParam
BooleanParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.BooleanParam
BooleanType - Class in org.apache.spark.sql.types: The data type representing Boolean values.
BooleanType() - Constructor for class org.apache.spark.sql.types.BooleanType
BooleanType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the BooleanType object.
boost(RDD<LabeledPoint>, RDD<LabeledPoint>, BoostingStrategy, boolean, long, String) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees: Internal method for performing regression using trees as base learners.
BoostingStrategy - Class in org.apache.spark.mllib.tree.configuration: Configuration options for GradientBoostedTrees.
BoostingStrategy(Strategy, Loss, int, double, double) - Constructor for class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Both() - Static method in class org.apache.spark.graphx.EdgeDirection: Edges originating from *and* arriving at a vertex of interest.
boundaries() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel: Boundaries in increasing order for which predictions are known.
boundaries() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
BoundedDouble - Class in org.apache.spark.partial: A Double value with error bars and associated confidence.
BoundedDouble(double, double, double, double) - Constructor for class org.apache.spark.partial.BoundedDouble
BreezeUtil - Class in org.apache.spark.ml.ann: In-place DGEMM and DGEMV for Breeze
BreezeUtil() - Constructor for class org.apache.spark.ml.ann.BreezeUtil
broadcast(T) - Method in class org.apache.spark.api.java.JavaSparkContext: Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
Broadcast<T> - Class in org.apache.spark.broadcast: A broadcast variable.
Broadcast(long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.Broadcast
broadcast(T, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
broadcast(Dataset<T>) - Static method in class org.apache.spark.sql.functions: Marks a DataFrame as small enough for use in broadcast joins.
BROADCAST() - Static method in class org.apache.spark.storage.BlockId
BroadcastBlockId - Class in org.apache.spark.storage
BroadcastBlockId(long, String) - Constructor for class org.apache.spark.storage.BroadcastBlockId
broadcastCleaned(long) - Method in interface org.apache.spark.CleanerListener
BroadcastFactory - Interface in org.apache.spark.broadcast: An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations).
broadcastId() - Method in class org.apache.spark.CleanBroadcast
broadcastId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
broadcastId() - Method in class org.apache.spark.storage.BroadcastBlockId
broadcastManager() - Method in class org.apache.spark.SparkEnv
bround(Column) - Static method in class org.apache.spark.sql.functions: Returns the value of the column e rounded to 0 decimal places with HALF_EVEN round mode.
bround(Column, int) - Static method in class org.apache.spark.sql.functions: Rounds the value of e to scale decimal places with HALF_EVEN round mode if scale is greater than or equal to 0, or at the integral part when scale is less than 0.
bucketBy(int, String, String...) - Method in class org.apache.spark.sql.DataFrameWriter: Buckets the output by the given columns.
bucketBy(int, String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameWriter: Buckets the output by the given columns.
BucketedRandomProjectionLSH - Class in org.apache.spark.ml.feature: :: Experimental ::
BucketedRandomProjectionLSH(String) - Constructor for class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
BucketedRandomProjectionLSH() - Constructor for class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
BucketedRandomProjectionLSHModel - Class in org.apache.spark.ml.feature: :: Experimental ::
BucketedRandomProjectionLSHParams - Interface in org.apache.spark.ml.feature: :: Experimental ::
Bucketizer - Class in org.apache.spark.ml.feature: Bucketizer maps a column of continuous features to a column of feature buckets.
Bucketizer(String) - Constructor for class org.apache.spark.ml.feature.Bucketizer
Bucketizer() - Constructor for class org.apache.spark.ml.feature.Bucketizer
bucketLength() - Method in interface org.apache.spark.ml.feature.BucketedRandomProjectionLSHParams: The length of each hash bucket; a larger bucket lowers the false negative rate.
buf() - Method in class org.apache.spark.sql.hive.HiveUDAFBuffer
buffer() - Method in class org.apache.spark.storage.memory.SerializedMemoryEntry
bufferEncoder() - Method in class org.apache.spark.sql.expressions.Aggregator: Specifies the Encoder for the intermediate value type.
BufferReleasingInputStream - Class in org.apache.spark.storage: Helper class that ensures a ManagedBuffer is released upon InputStream.close()
BufferReleasingInputStream(InputStream, ShuffleBlockFetcherIterator) - Constructor for class org.apache.spark.storage.BufferReleasingInputStream
bufferSchema() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: A StructType represents data types of values in the aggregation buffer.
build(Node, int) - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData$: Create DecisionTreeModelReadWrite.NodeData instances for this node and all children.
build(DecisionTreeModel, int) - Method in class org.apache.spark.ml.tree.EnsembleModelReadWrite.EnsembleNodeData$: Create EnsembleModelReadWrite.EnsembleNodeData instances for the given tree.
build() - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Builds and returns all combinations of parameters specified by the param grid.
build() - Method in class org.apache.spark.sql.types.MetadataBuilder: Builds the Metadata instance.
build() - Method in interface org.apache.spark.storage.memory.MemoryEntryBuilder
build() - Method in class org.apache.spark.streaming.kinesis.SparkAWSCredentials.Builder: Returns the appropriate instance of SparkAWSCredentials given the configured parameters.
builder() - Static method in class org.apache.spark.sql.SparkSession: Creates a SparkSession.Builder for constructing a SparkSession.
Builder() - Constructor for class org.apache.spark.sql.SparkSession.Builder
Builder() - Constructor for class org.apache.spark.streaming.kinesis.SparkAWSCredentials.Builder
buildErrorResponse(Response.Status, String) - Static method in class org.apache.spark.ui.UIUtils
buildPools() - Method in interface org.apache.spark.scheduler.SchedulableBuilder
buildReader(SparkSession, StructType, StructType, StructType, Seq<Filter>, Map<String, String>, Configuration) - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
buildScan(Seq<Attribute>, Seq<Expression>) - Method in interface org.apache.spark.sql.sources.CatalystScan
buildScan(String[], Filter[]) - Method in interface org.apache.spark.sql.sources.PrunedFilteredScan
buildScan(String[]) - Method in interface org.apache.spark.sql.sources.PrunedScan
buildScan() - Method in interface org.apache.spark.sql.sources.TableScan
buildTreeFromNodes(DecisionTreeModelReadWrite.NodeData[], String) - Static method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite: Given all data for all nodes in a tree, rebuild the tree.
builtinHiveVersion() - Static method in class org.apache.spark.sql.hive.HiveUtils: The version of hive used internally by Spark SQL.
BYTE() - Static method in class org.apache.spark.api.r.SerializationFormats
BYTE() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable byte type.
BytecodeUtils - Class in org.apache.spark.graphx.util: Includes a utility function to test whether a function accesses a specific attribute of an object.
BytecodeUtils() - Constructor for class org.apache.spark.graphx.util.BytecodeUtils
byteFromString(String, ByteUnit) - Static method in class org.apache.spark.internal.config.ConfigHelpers
BYTES_READ() - Method in class org.apache.spark.InternalAccumulator.input$
BYTES_WRITTEN() - Method in class org.apache.spark.InternalAccumulator.output$
BYTES_WRITTEN() - Method in class org.apache.spark.InternalAccumulator.shuffleWrite$
bytesRead() - Method in class org.apache.spark.status.api.v1.InputMetricDistributions
bytesRead() - Method in class org.apache.spark.status.api.v1.InputMetrics
bytesToString(long) - Static method in class org.apache.spark.util.Utils: Convert a quantity in bytes to a human-readable string such as "4.0 MB".
bytesToString(BigInt) - Static method in class org.apache.spark.util.Utils
byteStringAsBytes(String) - Static method in class org.apache.spark.util.Utils: Convert a passed byte string (e.g.
byteStringAsGb(String) - Static method in class org.apache.spark.util.Utils: Convert a passed byte string (e.g.
byteStringAsKb(String) - Static method in class org.apache.spark.util.Utils: Convert a passed byte string (e.g.
byteStringAsMb(String) - Static method in class org.apache.spark.util.Utils: Convert a passed byte string (e.g.
bytesWritten() - Method in class org.apache.spark.status.api.v1.OutputMetricDistributions
bytesWritten() - Method in class org.apache.spark.status.api.v1.OutputMetrics
bytesWritten() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetrics
bytesWritten(long) - Method in interface org.apache.spark.util.logging.RollingPolicy: Notify that bytes have been written
byteToString(long, ByteUnit) - Static method in class org.apache.spark.internal.config.ConfigHelpers
ByteType - Class in org.apache.spark.sql.types: The data type representing Byte values.
ByteType() - Constructor for class org.apache.spark.sql.types.ByteType
ByteType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the ByteType object.
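
The bround entries in this section name the HALF_EVEN round mode without showing its effect. The following is a minimal sketch of that rounding rule using only java.math.BigDecimal; it does not call Spark, and the class HalfEvenSketch and helper roundHalfEven are hypothetical names introduced for illustration. It assumes bround follows the standard HALF_EVEN semantics of java.math.RoundingMode, including rounding at the integral part for a negative scale.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class HalfEvenSketch {
    // Hypothetical helper (not a Spark API): rounds a decimal literal to the
    // given scale using HALF_EVEN, so ties go to the nearest even digit.
    static BigDecimal roundHalfEven(String value, int scale) {
        return new BigDecimal(value).setScale(scale, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        // Ties at scale 0: 2.5 rounds down to 2, 3.5 rounds up to 4.
        System.out.println(roundHalfEven("2.5", 0).toPlainString());
        System.out.println(roundHalfEven("3.5", 0).toPlainString());
        // Tie at scale 1: 1.25 rounds to 1.2 (even last digit).
        System.out.println(roundHalfEven("1.25", 1).toPlainString());
        // Negative scale rounds at the integral part: 25 becomes 20.
        System.out.println(roundHalfEven("25", -1).toPlainString());
    }
}
```

The inputs are passed as strings so that the tie cases are represented exactly; a binary double such as 1.25 happens to be exact, but many decimal literals are not, and an inexact input is no longer a tie.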

C

cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Persist this RDD with the default storage level (MEMORY_ONLY).
cache() - Method in class org.apache.spark.api.java.JavaPairRDD: Persist this RDD with the default storage level (MEMORY_ONLY).
cache() - Method in class org.apache.spark.api.java.JavaRDD: Persist this RDD with the default storage level (MEMORY_ONLY).
cache() - Method in class org.apache.spark.graphx.Graph: Caches the vertices and edges associated with this graph at the previously-specified target storage levels, which default to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl: Persists the edge partitions using targetStorageLevel, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl: Persists the vertex partitions at targetStorageLevel, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Caches the underlying RDD.
cache() - Method in class org.apache.spark.rdd.RDD: Persist this RDD with the default storage level (MEMORY_ONLY).
cache() - Method in class org.apache.spark.sql.Dataset: Persist this Dataset with the default storage level (MEMORY_AND_DISK).
cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cache() - Method in class org.apache.spark.streaming.dstream.DStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
cacheNodeIds() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: If false, the algorithm will pass trees to executors to match instances with nodes.
cacheSize() - Method in interface org.apache.spark.SparkExecutorInfo
cacheSize() - Method in class org.apache.spark.SparkExecutorInfoImpl
cacheTable(String) - Method in class org.apache.spark.sql.catalog.Catalog: Caches the specified table in-memory.
cacheTable(String, StorageLevel) - Method in class org.apache.spark.sql.catalog.Catalog: Caches the specified table with the given storage level.
cacheTable(String) - Method in class org.apache.spark.sql.SQLContext: Caches the specified table in-memory.
calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.regression.AFTCostFun
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy: :: DeveloperApi :: variance calculation
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini: :: DeveloperApi :: variance calculation
calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity: :: DeveloperApi :: information calculation for regression
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance: :: DeveloperApi :: variance calculation
calculateNumberOfPartitions(long, int, int) - Method in class org.apache.spark.ml.feature.Word2VecModel.Word2VecModelWriter$: Calculate the number of partitions to use in saving the model.
CalendarIntervalType - Class in org.apache.spark.sql.types: The data type representing calendar time intervals.
CalendarIntervalType() - Constructor for class org.apache.spark.sql.types.CalendarIntervalType
CalendarIntervalType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the CalendarIntervalType object.
call(K, Iterator<V1>, Iterator<V2>) - Method in interface org.apache.spark.api.java.function.CoGroupFunction
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
call(T) - Method in interface org.apache.spark.api.java.function.FilterFunction
call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
call(K, Iterator<V>) - Method in interface org.apache.spark.api.java.function.FlatMapGroupsFunction
call(K, Iterator<V>, GroupState<S>) - Method in interface org.apache.spark.api.java.function.FlatMapGroupsWithStateFunction
call(T) - Method in interface org.apache.spark.api.java.function.ForeachFunction
call(Iterator<T>) - Method in interface org.apache.spark.api.java.function.ForeachPartitionFunction
call(T1) - Method in interface org.apache.spark.api.java.function.Function
call() - Method in interface org.apache.spark.api.java.function.Function0
call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
call(T1, T2, T3, T4) - Method in interface org.apache.spark.api.java.function.Function4
call(T) - Method in interface org.apache.spark.api.java.function.MapFunction
call(K, Iterator<V>) - Method in interface org.apache.spark.api.java.function.MapGroupsFunction
call(K, Iterator<V>, GroupState<S>) - Method in interface org.apache.spark.api.java.function.MapGroupsWithStateFunction
call(Iterator<T>) - Method in interface org.apache.spark.api.java.function.MapPartitionsFunction
call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
call(T, T) - Method in interface org.apache.spark.api.java.function.ReduceFunction
call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
call(T1, T2) - Method in interface org.apache.spark.api.java.function.VoidFunction2
call() - Method in interface org.apache.spark.sql.api.java.UDF0
call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
callSite() - Method in class org.apache.spark.storage.RDDInfo
callUDF(String, Column...) - Static method in class org.apache.spark.sql.functions: Call a user-defined function.
callUDF(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions: Call a user-defined function.
cancel() - Method in class org.apache.spark.ComplexFutureAction
cancel() - Method in interface org.apache.spark.FutureAction: Cancels the execution of this action.
cancel() - Method in class org.apache.spark.SimpleFutureAction
cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext: Cancel all jobs that have been scheduled or are running.
cancelAllJobs() - Method in class org.apache.spark.SparkContext: Cancel all jobs that have been scheduled or are running.
cancelJob(int, String) - Method in class org.apache.spark.SparkContext: Cancel a given job if it's scheduled or running.
cancelJob(int) - Method in class org.apache.spark.SparkContext: Cancel a given job if it's scheduled or running.
cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Cancel active jobs for the specified group.
cancelJobGroup(String) - Method in class org.apache.spark.SparkContext: Cancel active jobs for the specified group.
cancelStage(int, String) - Method in class org.apache.spark.SparkContext: Cancel a given stage and all jobs associated with it.
cancelStage(int) - Method in class org.apache.spark.SparkContext: Cancel a given stage and all jobs associated with it.
cancelTasks(int, boolean) - Method in interface org.apache.spark.scheduler.TaskScheduler
canCreate(String) - Method in interface org.apache.spark.scheduler.ExternalClusterManager: Check if this cluster manager instance can create scheduler components for a certain master URL.
canDoMerge() - Method in class org.apache.spark.sql.hive.HiveUDAFBuffer
canEqual(Object) - Static method in class org.apache.spark.ExpireDeadHosts
canEqual(Object) - Static method in class org.apache.spark.ml.feature.Dot
canEqual(Object) - Static method in class org.apache.spark.Resubmitted
canEqual(Object) - Static method in class org.apache.spark.rpc.netty.OnStart
canEqual(Object) - Static method in class org.apache.spark.rpc.netty.OnStop
canEqual(Object) - Static method in class org.apache.spark.scheduler.AllJobsCancelled
canEqual(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
canEqual(Object) - Static method in class org.apache.spark.scheduler.JobSucceeded
canEqual(Object) - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
canEqual(Object) - Static method in class org.apache.spark.scheduler.StopCoordinator
canEqual(Object) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
canEqual(Object) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
canEqual(Object) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
canEqual(Object) - Static method in class org.apache.spark.sql.types.BinaryType
canEqual(Object) - Static method in class org.apache.spark.sql.types.BooleanType
canEqual(Object) - Static method in class org.apache.spark.sql.types.ByteType
canEqual(Object) - Static method in class org.apache.spark.sql.types.CalendarIntervalType
canEqual(Object) - Static method in class org.apache.spark.sql.types.DateType
canEqual(Object) - Static method in class org.apache.spark.sql.types.DoubleType
canEqual(Object) - Static method in class org.apache.spark.sql.types.FloatType
canEqual(Object) - Static method in class org.apache.spark.sql.types.IntegerType
canEqual(Object) - Static method in class org.apache.spark.sql.types.LongType
canEqual(Object) - Static method in class org.apache.spark.sql.types.NullType
canEqual(Object) - Static method in class org.apache.spark.sql.types.ShortType
canEqual(Object) - Static method in class org.apache.spark.sql.types.StringType
canEqual(Object) - Static method in class org.apache.spark.sql.types.TimestampType
canEqual(Object) - Static method in class org.apache.spark.StopMapOutputTracker
canEqual(Object) - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
canEqual(Object) - Static method in class org.apache.spark.Success
canEqual(Object) - Static method in class org.apache.spark.TaskResultLost
canEqual(Object) - Static method in class org.apache.spark.TaskSchedulerIsSet
canEqual(Object) - Static method in class org.apache.spark.UnknownReason
canEqual(Object) - Method in class org.apache.spark.util.MutablePair
canHandle(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
canHandle(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Check if this dialect instance can handle a certain jdbc url.
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
CanonicalRandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
canWrite(DataType, DataType, Function2<String, String, Object>, String, Function1<String, BoxedUnit>) - Static method in class org.apache.spark.sql.types.DataType: Returns true if the write data type can be read using the read data type.
cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
cartesian(RDD, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
caseSensitive() - Method in class org.apache.spark.ml.feature.StopWordsRemover: Whether to do a case sensitive comparison over the stop words.
cast(DataType) - Method in class org.apache.spark.sql.Column: Casts the column to a different data type.
cast(String) - Method in class org.apache.spark.sql.Column: Casts the column to a different data type, using the canonical string representation of the type.
Catalog - Class in org.apache.spark.sql.catalog: Catalog interface for Spark.
Catalog() - Constructor for class org.apache.spark.sql.catalog.Catalog
catalog() - Method in class org.apache.spark.sql.SparkSession: Interface through which the user may create, drop, alter or query underlying databases, tables, functions etc.
catalogString() - Method in class org.apache.spark.sql.types.ArrayType
catalogString() - Static method in class org.apache.spark.sql.types.BinaryType
catalogString() - Static method in class org.apache.spark.sql.types.BooleanType
catalogString() - Static method in class org.apache.spark.sql.types.ByteType
catalogString() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
catalogString() - Method in class org.apache.spark.sql.types.DataType: String representation for the type saved in external catalogs.
catalogString() - Static method in class org.apache.spark.sql.types.DateType
catalogString() - Static method in class org.apache.spark.sql.types.DoubleType
catalogString() - Static method in class org.apache.spark.sql.types.FloatType
catalogString() - Static method in class org.apache.spark.sql.types.IntegerType
catalogString() - Static method in class org.apache.spark.sql.types.LongType
catalogString() - Method in class org.apache.spark.sql.types.MapType
catalogString() - Static method in class org.apache.spark.sql.types.NullType
catalogString() - Static method in class org.apache.spark.sql.types.ShortType
catalogString() - Static method in class org.apache.spark.sql.types.StringType
catalogString() - Method in class org.apache.spark.sql.types.StructType
catalogString() - Static method in class org.apache.spark.sql.types.TimestampType
CatalystScan - Interface in org.apache.spark.sql.sources: ::Experimental:: An interface for experimenting with a more direct connection to the query planner.
Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
categoricalCols() - Method in class org.apache.spark.ml.feature.FeatureHasher: Numeric columns to treat as categorical features.
categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
CategoricalSplit - Class in org.apache.spark.ml.tree: Split which tests a categorical feature.
categories() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
categories() - Method in class org.apache.spark.mllib.tree.model.Split
categoryMaps() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
categorySizes() - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
cause() - Method in exception org.apache.spark.sql.AnalysisException
cause() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
CausedBy - Class in org.apache.spark.util: Extractor Object for pulling out the root cause of an error.
CausedBy() - Constructor for class org.apache.spark.util.CausedBy
cbrt(Column) - Static method in class org.apache.spark.sql.functions: Computes the cube-root of the given value.
cbrt(String) - Static method in class org.apache.spark.sql.functions: Computes the cube-root of the given column.
ceil(Column) - Static method in class org.apache.spark.sql.functions: Computes the ceiling of the given value.
ceil(String) - Static method in class org.apache.spark.sql.functions: Computes the ceiling of the given column.
ceil() - Method in class org.apache.spark.sql.types.Decimal
censorCol() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams: Param for censor column name.
chainl1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<Function2<T, T, T>>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
chainl1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser>, Function0<Parsers.Parser<Function2<T, U, T>>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
chainr1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<Function2<T, U, U>>>, Function2<T, U, U>, U) - Static method in class org.apache.spark.ml.feature.RFormulaParser
changePrecision(int, int) - Method in class org.apache.spark.sql.types.Decimal: Update precision and scale while keeping our value the same, and return true if successful.
channelRead0(ChannelHandlerContext, byte[]) - Method in class org.apache.spark.api.r.RBackendAuthHandler
CharType - Class in org.apache.spark.sql.types: Hive char type.
CharType(int) - Constructor for class org.apache.spark.sql.types.CharType
checkAndGetK8sMasterUrl(String) - Static method in class org.apache.spark.util.Utils: Check the validity of the given Kubernetes master URL and return the resolved URL.
checkColumnNameDuplication(Seq<String>, String, Function2<String, String, Object>) - Static method in class org.apache.spark.sql.util.SchemaUtils: Checks if input column names have duplicate identifiers.
checkColumnNameDuplication(Seq<String>, String, boolean) - Static method in class org.apache.spark.sql.util.SchemaUtils: Checks if input column names have duplicate identifiers.
checkColumnType(StructType, String, DataType, String) - Static method in class org.apache.spark.ml.util.SchemaUtils: Check whether the given schema contains a column of the required data type.
checkColumnTypes(StructType, String, Seq<DataType>, String) - Static method in class org.apache.spark.ml.util.SchemaUtils: Check whether the given schema contains a column of one of the required data types.
checkDataColumns(RFormula, Dataset<?>) - Static method in class org.apache.spark.ml.r.RWrapperUtils: DataFrame column check.
checkedCast() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams: Attempts to safely cast a user/item id to an Int.
checkFileExists(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils: Check if the file exists at the given path.
checkHost(String) - Static method in class org.apache.spark.util.Utils
checkHostPort(String) - Static method in class org.apache.spark.util.Utils
checkNumericType(StructType, String, String) - Static method in class org.apache.spark.ml.util.SchemaUtils: Check whether the given schema contains a column of the numeric data type.
checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike: Mark this RDD for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.Graph: Mark this Graph for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
checkpoint() - Method in class org.apache.spark.rdd.RDD: Mark this RDD for checkpointing.
checkpoint() - Method in class org.apache.spark.sql.Dataset: Eagerly checkpoint a Dataset and return the new Dataset.
checkpoint(boolean) - Method in class org.apache.spark.sql.Dataset: Returns a checkpointed version of this Dataset.
checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Sets the context to periodically checkpoint the DStream operations for master fault-tolerance.
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext: Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.
checkpointCleaned(long) - Method in interface org.apache.spark.CleanerListener
Checkpointed() - Static method in class org.apache.spark.rdd.CheckpointState
CheckpointingInProgress() - Static method in class org.apache.spark.rdd.CheckpointState
checkpointInterval() - Method in interface org.apache.spark.ml.param.shared.HasCheckpointInterval: Param for setting the checkpoint interval (>= 1) or disabling checkpointing (-1).
checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
CheckpointReader - Class in org.apache.spark.streaming
CheckpointReader() - Constructor for class org.apache.spark.streaming.CheckpointReader
CheckpointState - Class in org.apache.spark.rdd: Enumeration to manage state transitions of an RDD through checkpointing
CheckpointState() - Constructor for class org.apache.spark.rdd.CheckpointState
checkSchemaColumnNameDuplication(StructType, String, boolean) - Static method in class org.apache.spark.sql.util.SchemaUtils: Checks if an input schema has duplicate column names.
checkSingleVsMultiColumnParams(Params, Seq<Param<?>>, Seq<Param<?>>) - Static method in class org.apache.spark.ml.param.ParamValidators: Utility for Param validity checks for Transformers which have both single- and multi-column support.
checkSpeculatableTasks(int) - Method in interface org.apache.spark.scheduler.Schedulable
checkState(boolean, Function0<String>) - Static method in class org.apache.spark.streaming.util.HdfsUtils
checkThresholdConsistency() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams: If threshold and thresholds are both set, ensures they are consistent.
child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
child() - Method in class org.apache.spark.sql.sources.Not
CHILD_CONNECTION_TIMEOUT - Static variable in class org.apache.spark.launcher.SparkLauncher: Maximum time (in ms) to wait for a child process to connect back to the launcher server when using start().
CHILD_PROCESS_LOGGER_NAME - Static variable in class org.apache.spark.launcher.SparkLauncher: Logger name to use when launching a child process.
ChildFirstURLClassLoader - Class in org.apache.spark.util: A mutable class loader that gives preference to its own URLs over the parent class loader when loading classes and resources.
ChildFirstURLClassLoader(URL[], ClassLoader) - Constructor for class org.apache.spark.util.ChildFirstURLClassLoader
chiSqFunc() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
ChiSqSelector - Class in org.apache.spark.ml.feature: Chi-Squared feature selection, which selects categorical features to use for predicting a categorical label.
ChiSqSelector(String) - Constructor for class org.apache.spark.ml.feature.ChiSqSelector
ChiSqSelector() - Constructor for class org.apache.spark.ml.feature.ChiSqSelector
ChiSqSelector - Class in org.apache.spark.mllib.feature: Creates a ChiSquared feature selector.
ChiSqSelector() - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
ChiSqSelector(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector: This is the same as calling this() and setNumTopFeatures(numTopFeatures).
ChiSqSelectorModel - Class in org.apache.spark.ml.feature: Model fitted by ChiSqSelector.
ChiSqSelectorModel - Class in org.apache.spark.mllib.feature: Chi Squared selector model.
ChiSqSelectorModel(int[]) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel
ChiSqSelectorModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.feature
ChiSqSelectorModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.feature: Model data for import/export
ChiSqSelectorModel.SaveLoadV1_0$.Data$ - Class in org.apache.spark.mllib.feature
ChiSqSelectorParams - Interface in org.apache.spark.ml.feature: Params for ChiSqSelector and ChiSqSelectorModel.
chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics: Conduct Pearson's chi-squared goodness of fit test of the observed data against the expected distribution.
chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics: Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform distribution, with each category having an expected frequency of 1 / observed.size.
chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics: Conduct Pearson's independence test on the input contingency matrix, which cannot contain negative entries or columns or rows that sum up to 0.
chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics: Conduct Pearson's independence test for every feature against the label across the input RDD.
chiSqTest(JavaRDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics: Java-friendly version of chiSqTest()
ChiSqTest - Class in org.apache.spark.mllib.stat.test: Conduct the chi-squared test for the input RDDs using the specified method.
ChiSqTest() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest
ChiSqTest.Method - Class in org.apache.spark.mllib.stat.test: param: name String name for the method.
ChiSqTest.Method$ - Class in org.apache.spark.mllib.stat.test
ChiSqTest.NullHypothesis$ - Class in org.apache.spark.mllib.stat.test
ChiSqTestResult - Class in org.apache.spark.mllib.stat.test: Object containing the test results for the chi-squared hypothesis test.
chiSquared(Vector, Vector, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
chiSquaredFeatures(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest: Conduct Pearson's independence test for each feature against the label across the input RDD.
chiSquaredMatrix(Matrix, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
ChiSquareTest - Class in org.apache.spark.ml.stat: :: Experimental ::
ChiSquareTest() - Constructor for class org.apache.spark.ml.stat.ChiSquareTest
chmod700(File) - Static method in class org.apache.spark.util.Utils: JDK equivalent of chmod 700 file.
CholeskyDecomposition - Class in org.apache.spark.mllib.linalg: Compute Cholesky decomposition.
CholeskyDecomposition() - Constructor for class org.apache.spark.mllib.linalg.CholeskyDecomposition
cipherStream() - Method in interface org.apache.spark.security.CryptoStreamUtils.BaseErrorHandler: The encrypted stream that may get into an unhealthy state.
classForName(String) - Static method in class org.apache.spark.util.Utils: Preferred alternative to Class.forName(className)
Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
ClassificationLoss - Interface in org.apache.spark.mllib.tree.loss
ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification: :: DeveloperApi ::
ClassificationModel() - Constructor for class org.apache.spark.ml.classification.ClassificationModel
ClassificationModel - Interface in org.apache.spark.mllib.classification: Represents a classification model that predicts to which of a set of categories an example belongs.
Classifier<FeaturesType,E extends Classifier<FeaturesType,E,M>,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification: :: DeveloperApi ::
Classifier() - Constructor for class org.apache.spark.ml.classification.Classifier
classifier() - Method in interface org.apache.spark.ml.classification.OneVsRestParams: Param for the base binary classifier that multiclass classification is reduced to.
ClassifierParams - Interface in org.apache.spark.ml.classification: (private[spark]) Params for classification.
ClassifierTypeTrait - Interface in org.apache.spark.ml.classification
classIsLoadable(String) - Static method in class org.apache.spark.util.Utils: Determines whether the provided class is loadable in the current thread.
className() - Method in class org.apache.spark.ExceptionFailure
className() - Static method in class org.apache.spark.ml.linalg.JsonMatrixConverter: Unique class name for identifying a JSON object encoded by this class.
className() - Method in class org.apache.spark.sql.catalog.Function
classpathEntries() - Method in class org.apache.spark.status.api.v1.ApplicationEnvironmentInfo
classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
classTag() - Method in class org.apache.spark.api.java.JavaRDD
classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
classTag() - Method in class org.apache.spark.sql.Dataset
classTag() - Method in class org.apache.spark.storage.memory.DeserializedMemoryEntry
classTag() - Method in interface org.apache.spark.storage.memory.MemoryEntry
classTag() - Method in class org.apache.spark.storage.memory.SerializedMemoryEntry
classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
clean(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLog: Clean all the records that are older than the threshold time.
clean(Object, boolean, boolean) - Static method in class org.apache.spark.util.ClosureCleaner: Clean the given closure in place.
CleanAccum - Class in org.apache.spark
CleanAccum(long) - Constructor for class org.apache.spark.CleanAccum
CleanBroadcast - Class in org.apache.spark
CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
CleanCheckpoint - Class in org.apache.spark
CleanCheckpoint(int) - Constructor for class org.apache.spark.CleanCheckpoint
CleanerListener - Interface in org.apache.spark: Listener class used for testing when any item has been cleaned by the Cleaner class.
cleaning() - Method in class org.apache.spark.status.LiveStage
CleanRDD - Class in org.apache.spark
CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
CleanShuffle - Class in org.apache.spark
CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
cleanupOldBlocks(long) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler: Clean up blocks older than the given threshold time.
CleanupTask - Interface in org.apache.spark: Classes that represent cleaning tasks.
CleanupTaskWeakReference - Class in org.apache.spark: A WeakReference associated with a CleanupTask.
CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
clear(Param<?>) - Method in interface org.apache.spark.ml.param.Params: Clears the user-supplied value for the input param.
clear() - Method in class org.apache.spark.sql.util.ExecutionListenerManager: Removes all the registered QueryExecutionListener.
clear() - Static method in class org.apache.spark.util.AccumulatorContext: Clears all registered AccumulatorV2s.
clearActive() - Static method in class org.apache.spark.sql.SQLContext: Deprecated. Use SparkSession.clearActiveSession instead. Since 2.0.0.
clearActiveSession() - Static method in class org.apache.spark.sql.SparkSession: Clears the active SparkSession for current thread.
clearCache() - Method in class org.apache.spark.sql.catalog.Catalog: Removes all cached tables from the in-memory cache.
clearCache() - Method in class org.apache.spark.sql.SQLContext: Removes all cached tables from the in-memory cache.
clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext: Pass-through to SparkContext.clearCallSite.
clearCallSite() - Method in class org.apache.spark.SparkContext: Clear the thread-local property for overriding the call sites of actions and RDDs.
clearDefaultSession() - Static method in class org.apache.spark.sql.SparkSession: Clears the default SparkSession that is returned by the builder.
clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext: Clear the current thread's job group ID and its description.
clearJobGroup() - Method in class org.apache.spark.SparkContext: Clear the current thread's job group ID and its description.
clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel: Clears the threshold so that predict will output raw prediction scores.
clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel: Clears the threshold so that predict will output raw prediction scores.
Clock - Interface in org.apache.spark.util: An interface to represent clocks, so that they can be mocked out in unit tests.
CLogLog$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.CLogLog$
clone() - Method in class org.apache.spark.SparkConf: Copy this object
clone() - Method in class org.apache.spark.sql.ExperimentalMethods
clone() - Method in class org.apache.spark.sql.types.Decimal
clone() - Method in class org.apache.spark.sql.util.ExecutionListenerManager: Get an identical copy of this listener manager.
clone() - Method in class org.apache.spark.storage.StorageLevel
clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
clone() - Method in class org.apache.spark.util.random.BernoulliSampler
clone() - Method in class org.apache.spark.util.random.PoissonSampler
clone() - Method in interface org.apache.spark.util.random.RandomSampler: return a copy of the RandomSampler object
clone(T, SerializerInstance, ClassTag<T>) - Static method in class org.apache.spark.util.Utils: Clone an object using a Spark serializer.
cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler: Return a sampler that is the complement of the range specified by the current sampler.
cloneProperties(Properties) - Static method in class org.apache.spark.util.Utils: Create a new properties object with the same values as `props`.
close() - Method in class org.apache.spark.api.java.JavaSparkContext
close() - Method in class org.apache.spark.io.NioBufferedFileInputStream
close() - Method in class org.apache.spark.io.ReadAheadInputStream
close() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
close() - Method in interface org.apache.spark.security.CryptoStreamUtils.BaseErrorHandler
close() - Method in class org.apache.spark.serializer.DeserializationStream
close() - Method in class org.apache.spark.serializer.SerializationStream
close(Throwable) - Method in class org.apache.spark.sql.ForeachWriter: Called when processing of one partition of new data stops on the executor side.
close() - Method in class org.apache.spark.sql.hive.execution.HiveOutputWriter
close() - Method in class org.apache.spark.sql.SparkSession: Synonym for stop().
close() - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
close() - Method in class org.apache.spark.sql.vectorized.ColumnarBatch: Called to close all the columns in this batch.
close() - Method in class org.apache.spark.sql.vectorized.ColumnVector: Cleans up memory for this column vector.
close() - Method in class org.apache.spark.storage.BufferReleasingInputStream
close() - Method in class org.apache.spark.storage.CountingWritableChannel
close() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
close() - Method in class org.apache.spark.streaming.util.WriteAheadLog: Close this log and release any resources.
closed() - Method in interface org.apache.spark.security.CryptoStreamUtils.BaseErrorHandler
closeWriter(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
ClosureCleaner - Class in org.apache.spark.util: A cleaner that renders closures serializable if this can be done safely.
ClosureCleaner() - Constructor for class org.apache.spark.util.ClosureCleaner
closureSerializer() - Method in class org.apache.spark.SparkEnv
cls() - Method in class org.apache.spark.sql.types.ObjectType
cls() - Method in class org.apache.spark.util.MethodIdentifier
clsTag() - Method in interface org.apache.spark.sql.Encoder: A ClassTag that can be used to construct an Array to contain a collection of T.
cluster() - Method in class org.apache.spark.ml.clustering.ClusteringSummary: Cluster centers of the transformed data.
cluster() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
clusterCenter() - Method in class org.apache.spark.ml.clustering.ClusterData
clusterCenters() - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
clusterCenters() - Method in class org.apache.spark.ml.clustering.KMeansModel
clusterCenters() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Leaf cluster centers.
clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
ClusterData - Class in org.apache.spark.ml.clustering: Helper class for storing model data
ClusterData(int, Vector) - Constructor for class org.apache.spark.ml.clustering.ClusterData
clusteredColumns - Variable in class org.apache.spark.sql.sources.v2.reader.partitioning.ClusteredDistribution: The names of the clustered columns.
ClusteredDistribution - Class in org.apache.spark.sql.sources.v2.reader.partitioning: A concrete implementation of Distribution.
ClusteredDistribution(String[]) - Constructor for class org.apache.spark.sql.sources.v2.reader.partitioning.ClusteredDistribution
clusterIdx() - Method in class org.apache.spark.ml.clustering.ClusterData
ClusteringEvaluator - Class in org.apache.spark.ml.evaluation: :: Experimental ::
ClusteringEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.ClusteringEvaluator
ClusteringEvaluator() - Constructor for class org.apache.spark.ml.evaluation.ClusteringEvaluator
ClusteringSummary - Class in org.apache.spark.ml.clustering: :: Experimental :: Summary of clustering algorithms.
clusterSizes() - Method in class org.apache.spark.ml.clustering.ClusteringSummary: Size of (number of data points in) each cluster.
ClusterStats(Vector, double, long) - Constructor for class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats
ClusterStats$() - Constructor for class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats$
clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
cn() - Method in class org.apache.spark.mllib.feature.VocabWord
coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, RDD<?>) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer: Runs the packing algorithm and returns an array of PartitionGroups that, if possible, are load balanced and grouped by locality.
coalesce(int, RDD<?>) - Method in interface org.apache.spark.rdd.PartitionCoalescer: Coalesce the partitions of the given RDD.
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset that has exactly numPartitions partitions, when fewer partitions are requested.
coalesce(Column...) - Static method in class org.apache.spark.sql.functions: Returns the first column that is not null, or null if all inputs are null.
coalesce(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Returns the first column that is not null, or null if all inputs are null.
CoarseGrainedClusterMessage - Interface in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages
CoarseGrainedClusterMessages.AddWebUIFilter - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.AddWebUIFilter$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.GetExecutorLossReason - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.GetExecutorLossReason$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.KillExecutors - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.KillExecutors$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.KillExecutorsOnHost - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.KillExecutorsOnHost$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.KillTask - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.KillTask$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.LaunchTask - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.LaunchTask$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RegisterClusterManager - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RegisterClusterManager$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RegisteredExecutor$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RegisterExecutor - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RegisterExecutor$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RegisterExecutorFailed - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RegisterExecutorFailed$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RegisterExecutorResponse - Interface in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RemoveExecutor - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RemoveExecutor$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RemoveWorker - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RemoveWorker$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RequestExecutors - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RequestExecutors$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RetrieveLastAllocatedExecutorId$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.RetrieveSparkAppConfig$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.ReviveOffers$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.SetupDriver - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.SetupDriver$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.Shutdown$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.SparkAppConfig - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.SparkAppConfig$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.StatusUpdate - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.StatusUpdate$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.StopDriver$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.StopExecutor$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.StopExecutors$ - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.UpdateDelegationTokens - Class in org.apache.spark.scheduler.cluster
CoarseGrainedClusterMessages.UpdateDelegationTokens$ - Class in org.apache.spark.scheduler.cluster
code() - Method in class org.apache.spark.mllib.feature.VocabWord
CodegenMetrics - Class in org.apache.spark.metrics.source: :: Experimental :: Metrics for code generation.
CodegenMetrics() - Constructor for class org.apache.spark.metrics.source.CodegenMetrics
codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
coefficientMatrix() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
coefficients() - Method in class org.apache.spark.ml.classification.LinearSVCModel
coefficients() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: A vector of model coefficients for "binomial" logistic regression.
coefficients() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
coefficients() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
coefficients() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
coefficientStandardErrors() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary: Standard error of estimated coefficients and intercept.
coefficientStandardErrors() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Standard error of estimated coefficients and intercept.
cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(KeyValueGroupedDataset<K, U>, Function3<K, Iterator<V>, Iterator, TraversableOnce<R>>, Encoder<R>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: (Scala-specific) Applies the given function to each cogrouped data.
cogroup(KeyValueGroupedDataset<K, U>, CoGroupFunction<K, V, U, R>, Encoder<R>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: (Java-specific) Applies the given function to each cogrouped data.
cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
CoGroupedRDD<K> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that cogroups its parents.
CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner, ClassTag<K>) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
CoGroupFunction<K,V1,V2,R> - Interface in org.apache.spark.api.java.function: A function that returns zero or more output records from each grouping key and its values from 2 Datasets.
col(String) - Method in class org.apache.spark.sql.Dataset: Selects a column based on the column name and returns it as a Column.
col(String) - Static method in class org.apache.spark.sql.functions: Returns a Column based on the given column name.
coldStartStrategy() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams: Param for strategy for dealing with unknown or new users/items at prediction time.
colIter() - Method in class org.apache.spark.ml.linalg.DenseMatrix
colIter() - Method in interface org.apache.spark.ml.linalg.Matrix: Returns an iterator of column vectors.
colIter() - Method in class org.apache.spark.ml.linalg.SparseMatrix
colIter() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
colIter() - Method in interface org.apache.spark.mllib.linalg.Matrix: Returns an iterator of column vectors.
colIter() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
collect() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an array that contains all of the elements in this RDD.
collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
collect() - Method in class org.apache.spark.rdd.RDD: Return an array that contains all of the elements in this RDD.
collect(PartialFunction<T, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return an RDD that contains all matching values by applying f.
collect() - Method in class org.apache.spark.sql.Dataset: Returns an array that contains all rows in this Dataset.
collect_list(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns a list of objects with duplicates.
collect_list(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns a list of objects with duplicates.
collect_set(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns a set of objects with duplicate elements eliminated.
collect_set(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns a set of objects with duplicate elements eliminated.
collectAsList() - Method in class org.apache.spark.sql.Dataset: Returns a Java list that contains all rows in this Dataset.
collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD: Return the key-value pairs in this RDD to the master as a Map.
collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return the key-value pairs in this RDD to the master as a Map.
collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike: The asynchronous version of collect, which returns a future for retrieving an array containing all of the elements in this RDD.
collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for retrieving all elements of this RDD.
collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps: Returns an RDD that contains for each vertex v its local edges, i.e., the edges that are incident on v, in the user-specified direction.
collectionAccumulator() - Method in class org.apache.spark.SparkContext: Create and register a CollectionAccumulator, which starts with an empty list and accumulates inputs by adding them to the list.
collectionAccumulator(String) - Method in class org.apache.spark.SparkContext: Create and register a CollectionAccumulator, which starts with an empty list and accumulates inputs by adding them to the list.
CollectionAccumulator<T> - Class in org.apache.spark.util: An accumulator for collecting a list of elements.
CollectionAccumulator() - Constructor for class org.apache.spark.util.CollectionAccumulator
CollectionsUtils - Class in org.apache.spark.util
CollectionsUtils() - Constructor for class org.apache.spark.util.CollectionsUtils
collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps: Collect the neighbor vertex ids for each vertex.
collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps: Collect the neighbor vertex attributes for each vertex.
collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an array that contains all of the elements in a specific partition of this RDD.
collectSubModels() - Method in interface org.apache.spark.ml.param.shared.HasCollectSubModels: Param for whether to collect a list of sub-models trained during tuning.
colPtrs() - Method in class org.apache.spark.ml.linalg.SparseMatrix
colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
colRegex(String) - Method in class org.apache.spark.sql.Dataset: Selects a column based on the column name specified as a regex and returns it as a Column.
colsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics: Computes column-wise summary statistics for the input RDD[Vector].
Column - Class in org.apache.spark.sql.catalog: A column in Spark, as returned by listColumns method in Catalog.
Column(String, String, String, boolean, boolean, boolean) - Constructor for class org.apache.spark.sql.catalog.Column
Column - Class in org.apache.spark.sql: A column that will be computed based on the data in a DataFrame.
Column(Expression) - Constructor for class org.apache.spark.sql.Column
Column(String) - Constructor for class org.apache.spark.sql.Column
column(String) - Static method in class org.apache.spark.sql.functions: Returns a Column based on the given column name.
column(int) - Method in class org.apache.spark.sql.vectorized.ColumnarBatch: Returns the column at `ordinal`.
ColumnarArray - Class in org.apache.spark.sql.vectorized: Array abstraction in ColumnVector.
ColumnarArray(ColumnVector, int, int) - Constructor for class org.apache.spark.sql.vectorized.ColumnarArray
ColumnarBatch - Class in org.apache.spark.sql.vectorized: This class wraps multiple ColumnVectors as a row-wise table.
ColumnarBatch(ColumnVector[]) - Constructor for class org.apache.spark.sql.vectorized.ColumnarBatch
ColumnarMap - Class in org.apache.spark.sql.vectorized: Map abstraction in ColumnVector.
ColumnarMap(ColumnVector, ColumnVector, int, int) - Constructor for class org.apache.spark.sql.vectorized.ColumnarMap
ColumnarRow - Class in org.apache.spark.sql.vectorized: Row abstraction in ColumnVector.
ColumnarRow(ColumnVector, int) - Constructor for class org.apache.spark.sql.vectorized.ColumnarRow
ColumnName - Class in org.apache.spark.sql: A convenient class used for constructing schema.
ColumnName(String) - Constructor for class org.apache.spark.sql.ColumnName
ColumnPruner - Class in org.apache.spark.ml.feature: Utility transformer for removing temporary columns from a DataFrame.
ColumnPruner(String, Set<String>) - Constructor for class org.apache.spark.ml.feature.ColumnPruner
ColumnPruner(Set<String>) - Constructor for class org.apache.spark.ml.feature.ColumnPruner
columns() - Method in class org.apache.spark.sql.Dataset: Returns all column names as an array.
columnSchema() - Static method in class org.apache.spark.ml.image.ImageSchema: Schema for the image column: Row(String, Int, Int, Int, Int, Array[Byte])
columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Compute all cosine similarities between columns of this matrix using the brute-force approach of computing normalized dot products.
columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Compute all cosine similarities between columns of this matrix using the brute-force approach of computing normalized dot products.
columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Compute similarities between columns of this matrix using a sampling approach.
columnsToPrune() - Method in class org.apache.spark.ml.feature.ColumnPruner
columnToOldVector(Dataset<?>, String) - Static method in class org.apache.spark.ml.util.DatasetUtils
columnToVector(Dataset<?>, String) - Static method in class org.apache.spark.ml.util.DatasetUtils: Cast a column in a Dataset to Vector type.
ColumnVector - Class in org.apache.spark.sql.vectorized: An interface representing in-memory columnar data in Spark.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.api.java.JavaPairRDD: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Simplified version of combineByKey that hash-partitions the output RDD and uses map-side aggregation.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD: Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level and using map-side aggregation.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Simplified version of combineByKeyWithClassTag that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Combine elements of each key in DStream's RDDs using custom functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Combine elements of each key in DStream's RDDs using custom functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Combine elements of each key in DStream's RDDs using custom functions.
combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions: :: Experimental :: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions: :: Experimental :: Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD.
combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions: :: Experimental :: Simplified version of combineByKeyWithClassTag that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineCombinersByKey(Iterator<? extends Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
combineValuesByKey(Iterator<? extends Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
CommandLineUtils - Interface in org.apache.spark.util: Contains basic command line parsing functionality and methods to parse some common Spark CLI options.
commit(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
commit(Offset) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader: Informs the source that Spark has completed processing all data for offsets less than or equal to `end` and will only request offsets greater than `end` in the future.
commit(Offset) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader: Informs the source that Spark has completed processing all data for offsets less than or equal to `end` and will only request offsets greater than `end` in the future.
commit(WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.DataSourceWriter: Commits this writing job with a list of commit messages.
commit() - Method in interface org.apache.spark.sql.sources.v2.writer.DataWriter: Commits this writer after all records are written successfully, and returns a commit message that will be sent back to the driver side and passed to DataSourceWriter.commit(WriterCommitMessage[]).
commit(long, WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.streaming.StreamWriter: Commits this writing job for the specified epoch with a list of commit messages.
commit(WriterCommitMessage[]) - Method in interface org.apache.spark.sql.sources.v2.writer.streaming.StreamWriter
commitJob(JobContext, Seq<FileCommitProtocol.TaskCommitMessage>) - Method in class org.apache.spark.internal.io.FileCommitProtocol: Commits a job after the writes succeed.
commitJob(JobContext, Seq<FileCommitProtocol.TaskCommitMessage>) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
commitTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol: Commits a task after the writes succeed.
commitTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
commitTask(OutputCommitter, TaskAttemptContext, int, int) - Static method in class org.apache.spark.mapred.SparkHadoopMapRedUtil: Commits a task output.
commonHeaderNodes(HttpServletRequest) - Static method in class org.apache.spark.ui.UIUtils
comparator(Schedulable, Schedulable) - Method in interface org.apache.spark.scheduler.SchedulingAlgorithm
compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
compare(Decimal) - Method in class org.apache.spark.sql.types.Decimal
compare(Decimal, Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
compareTo(SparkShutdownHook) - Method in class org.apache.spark.util.SparkShutdownHook
compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
compileValue(Object) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Converts value to SQL expression.
compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
compileValue(Object) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
Complete() - Static method in class org.apache.spark.sql.streaming.OutputMode: OutputMode in which all the rows in the streaming DataFrame/Dataset will be written to the sink every time there are some updates.
completed() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
completedIndices() - Method in class org.apache.spark.status.LiveJob
completedIndices() - Method in class org.apache.spark.status.LiveStage
completedStages() - Method in class org.apache.spark.status.LiveJob
completedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
completedTasks() - Method in class org.apache.spark.status.LiveExecutor
completedTasks() - Method in class org.apache.spark.status.LiveJob
completedTasks() - Method in class org.apache.spark.status.LiveStage
COMPLETION_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
completionTime() - Method in class org.apache.spark.scheduler.StageInfo: Time when all tasks in the stage completed or when the stage was cancelled.
completionTime() - Method in class org.apache.spark.status.api.v1.JobData
completionTime() - Method in class org.apache.spark.status.api.v1.StageData
completionTime() - Method in class org.apache.spark.status.LiveJob
ComplexFutureAction<T> - Class in org.apache.spark: A FutureAction for actions that could trigger multiple Spark jobs.
ComplexFutureAction(Function1<JobSubmitter, Future<T>>) - Constructor for class org.apache.spark.ComplexFutureAction
compressed() - Method in interface org.apache.spark.ml.linalg.Matrix: Returns a matrix in dense column major, dense row major, sparse row major, or sparse column major format, whichever uses less storage.
compressed() - Method in interface org.apache.spark.ml.linalg.Vector: Returns a vector in either dense or sparse format, whichever uses less storage.
compressed() - Method in interface org.apache.spark.mllib.linalg.Vector: Returns a vector in either dense or sparse format, whichever uses less storage.
compressedColMajor() - Method in interface org.apache.spark.ml.linalg.Matrix: Returns a matrix in dense or sparse column major format, whichever uses less storage.
compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.ZStdCompressionCodec
compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.ZStdCompressionCodec
compressedRowMajor() - Method in interface org.apache.spark.ml.linalg.Matrix: Returns a matrix in dense or sparse row major format, whichever uses less storage.
CompressionCodec - Interface in org.apache.spark.io: :: DeveloperApi :: CompressionCodec allows the customization of choosing different compression implementations to be used in block storage.
compute(Partition, TaskContext) - Method in class org.apache.spark.api.r.BaseRRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD: Provides the RDD[(VertexId, VD)] equivalent output.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient: Compute the gradient and loss given the features of a single data point.
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient: Compute the gradient and loss given the features of a single data point, add the gradient to a provided vector to avoid creating new objects, and return loss.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater: Compute an updated value for weights given the gradient, stepSize, iteration number and regularization parameter.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD: :: DeveloperApi :: Implemented by subclasses to compute a given partition.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Generate an RDD for the given duration.
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Method that generates an RDD for the given Duration.
compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream: Method that generates an RDD for the given time.
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
compute(long, long, long, long) - Method in interface org.apache.spark.streaming.scheduler.rate.RateEstimator: Computes the number of records the stream attached to this RateEstimator should ingest per second, given an update on the size and completion times of the latest batch.
computeClusterStats(Dataset<Row>, String, String) - Static method in class org.apache.spark.ml.evaluation.CosineSilhouette: The method takes the input dataset and computes the aggregated values about a cluster which are needed by the algorithm.
computeClusterStats(Dataset<Row>, String, String) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette: The method takes the input dataset and computes the aggregated values about a cluster which are needed by the algorithm.
computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes column-wise summary statistics.
computeCorrelation(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation: Compute correlation for two datasets.
computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation: Compute the Pearson correlation for two datasets.
computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation: Compute Spearman's correlation for two datasets.
computeCorrelationMatrix(RDD<Vector>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation: Compute the correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation: Compute the Pearson correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation: Compute Spearman's correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrixFromCovariance(Matrix) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation: Compute the Pearson correlation matrix from the covariance matrix.
computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Method in interface org.apache.spark.mllib.stat.correlation.Correlation: Combine the two input RDD[Double]s into an RDD[Vector] and compute the correlation using the correlation implementation for RDD[Vector].
computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
computeCost(Dataset<?>) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel: Computes the sum of squared distances between the input points and their corresponding cluster centers.
computeCost(Dataset<?>) - Method in class org.apache.spark.ml.clustering.KMeansModel: Deprecated. This method is deprecated and will be removed in 3.0.0. Use ClusteringEvaluator instead. You can also get the cost on the training dataset in the summary.
computeCost(Vector) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Computes the squared distance between the input point and the cluster center it belongs to.
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Computes the sum of squared distances between the input points and their corresponding cluster centers.
computeCost(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Java-friendly version of computeCost().
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the covariance matrix, treating each row as an observation.
computeError(RDD<LabeledPoint>, DecisionTreeRegressionModel[], double[], Loss) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees: Method to calculate error of the base learner for the gradient boosting calculation.
computeError(org.apache.spark.mllib.tree.model.TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss: Method to calculate error of the base learner for the gradient boosting calculation.
computeError(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss: Method to calculate loss when the predictions are already known.
computeFractionForSampleSize(int, long, boolean) - Static method in class org.apache.spark.util.random.SamplingUtils: Returns a sampling rate that guarantees a sample of size greater than or equal to sampleSizeLowerBound 99.99% of the time.
computeGradient(DenseMatrix<Object>, DenseMatrix<Object>, Vector, int) - Method in interface org.apache.spark.ml.ann.TopologyModel: Computes the gradient for the network.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Computes the Gramian matrix A^T A.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the Gramian matrix A^T A.
computeInitialPredictionAndError(RDD<LabeledPoint>, double, DecisionTreeRegressionModel, Loss) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees: Compute the initial predictions and errors for a dataset for the first iteration of gradient boosting.
computeInitialPredictionAndError(RDD<LabeledPoint>, double, DecisionTreeModel, Loss) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel: :: DeveloperApi :: Compute the initial predictions and errors for a dataset for the first iteration of gradient boosting.
computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo: Computes the preferred locations based on the input(s) and returns a location-to-block map.
computePrevDelta(DenseMatrix<Object>, DenseMatrix<Object>, DenseMatrix<Object>) - Method in interface org.apache.spark.ml.ann.LayerModel: Computes the delta for back propagation.
computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the top k principal components only.
computePrincipalComponentsAndExplainedVariance(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the top k principal components and a vector of proportions of variance explained by each principal component.
computeProbability(double) - Method in interface org.apache.spark.mllib.tree.loss.ClassificationLoss: Computes the class probability given the margin.
computeSilhouetteCoefficient(Broadcast<Map<Object, Tuple2<Vector, Object>>>, Vector, double) - Static method in class org.apache.spark.ml.evaluation.CosineSilhouette: It computes the Silhouette coefficient for a point.
computeSilhouetteCoefficient(Broadcast<Map<Object, SquaredEuclideanSilhouette.ClusterStats>>, Vector, double, double) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette: It computes the Silhouette coefficient for a point.
computeSilhouetteScore(Dataset<?>, String, String) - Static method in class org.apache.spark.ml.evaluation.CosineSilhouette: Compute the Silhouette score of the dataset using the cosine distance measure.
computeSilhouetteScore(Dataset<?>, String, String) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette: Compute the Silhouette score of the dataset using squared Euclidean distance measure.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Computes the singular value decomposition of this IndexedRowMatrix.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes singular value decomposition of this matrix.
computeThresholdByKey(Map<K, AcceptanceResult>, Map<K, Object>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils: Given the result returned by getCounts, determine the threshold for accepting items to generate exact sample size.
concat(Column...) - Static method in class org.apache.spark.sql.functions: Concatenates multiple input columns together into a single column.
concat(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Concatenates multiple input columns together into a single column.
concat_ws(String, Column...) - Static method in class org.apache.spark.sql.functions: Concatenates multiple input string columns together into a single string column, using the given separator.
concat_ws(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions: Concatenates multiple input string columns together into a single string column, using the given separator.
Conf(int, int, double, double, double, double, double, double) - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
conf() - Method in interface org.apache.spark.input.Configurable
conf() - Method in class org.apache.spark.SparkEnv
conf() - Method in class org.apache.spark.sql.hive.RelationConversions
conf() - Method in class org.apache.spark.sql.SparkSession: Runtime configuration interface for Spark.
confidence() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule: Returns the confidence of the rule.
confidence() - Method in class org.apache.spark.partial.BoundedDouble
confidence() - Method in class org.apache.spark.util.sketch.CountMinSketch: Returns the confidence (or delta) of this CountMinSketch.
config(String, String) - Method in class org.apache.spark.sql.SparkSession.Builder: Sets a config option.
config(String, long) - Method in class org.apache.spark.sql.SparkSession.Builder: Sets a config option.
config(String, double) - Method in class org.apache.spark.sql.SparkSession.Builder: Sets a config option.
config(String, boolean) - Method in class org.apache.spark.sql.SparkSession.Builder: Sets a config option.
config(SparkConf) - Method in class org.apache.spark.sql.SparkSession.Builder: Sets a list of config options based on the given SparkConf.
config - Class in org.apache.spark.status
config() - Constructor for class org.apache.spark.status.config
ConfigEntryWithDefault<T> - Class in org.apache.spark.internal.config
ConfigEntryWithDefault(String, List<String>, T, Function1<String, T>, Function1<T, String>, String, boolean) - Constructor for class org.apache.spark.internal.config.ConfigEntryWithDefault
ConfigEntryWithDefaultFunction<T> - Class in org.apache.spark.internal.config
ConfigEntryWithDefaultFunction(String, List<String>, Function0<T>, Function1<String, T>, Function1<T, String>, String, boolean) - Constructor for class org.apache.spark.internal.config.ConfigEntryWithDefaultFunction
ConfigEntryWithDefaultString<T> - Class in org.apache.spark.internal.config
ConfigEntryWithDefaultString(String, List<String>, String, Function1<String, T>, Function1<T, String>, String, boolean) - Constructor for class org.apache.spark.internal.config.ConfigEntryWithDefaultString
ConfigHelpers - Class in org.apache.spark.internal.config
ConfigHelpers() - Constructor for class org.apache.spark.internal.config.ConfigHelpers
ConfigProvider - Interface in org.apache.spark.internal.config: A source of configuration values.
configTestLog4j(String) - Static method in class org.apache.spark.TestUtils: Configures the log4j properties used for test suites.
Configurable - Interface in org.apache.spark.input: A trait to implement Configurable interface.
configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD: Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.NewHadoopRDD: Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
configureJobPropertiesForStorageHandler(TableDesc, Configuration, boolean) - Static method in class org.apache.spark.sql.hive.HiveTableUtil
confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns the confusion matrix: predicted classes are in columns, ordered by ascending class label, as in "labels".
connectedComponents() - Method in class org.apache.spark.graphx.GraphOps: Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
connectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps: Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
ConnectedComponents - Class in org.apache.spark.graphx.lib: Connected components algorithm.
ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
consequent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream: An input stream that always returns the same RDD on each time step.
ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
constructTree(org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData[]) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$: Given a list of nodes from a tree, construct the tree.
constructTrees(RDD<org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
constructURIForAuthentication(URI, org.apache.spark.SecurityManager) - Static method in class org.apache.spark.util.Utils: Construct a URI containing information used for authentication.
contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap: Checks whether a parameter is explicitly specified.
contains(String) - Method in class org.apache.spark.SparkConf: Does the configuration contain a given parameter?
contains(Object) - Method in class org.apache.spark.sql.Column: Contains the other element.
contains(String) - Method in class org.apache.spark.sql.types.Metadata: Tests whether this Metadata contains a binding for a key.
containsDelimiters() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
containsKey(Object) - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
containsNull() - Method in class org.apache.spark.sql.types.ArrayType
contentType() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
context() - Method in interface org.apache.spark.api.java.JavaRDDLike: The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.InterruptibleIterator
context(SQLContext) - Static method in class org.apache.spark.ml.r.RWrappers
context(SQLContext) - Method in interface org.apache.spark.ml.util.BaseReadWrite: Deprecated. Use session instead. This method will be removed in 3.0.0.
context(SQLContext) - Method in class org.apache.spark.ml.util.GeneralMLWriter
context(SQLContext) - Method in class org.apache.spark.ml.util.MLReader
context(SQLContext) - Method in class org.apache.spark.ml.util.MLWriter
context() - Method in class org.apache.spark.rdd.RDD: The SparkContext that this RDD was created on.
context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return the StreamingContext associated with this DStream.
context() - Method in class org.apache.spark.streaming.dstream.DStream: Return the StreamingContext associated with this DStream.
ContextBarrierId - Class in org.apache.spark: For each barrier stage attempt, only at most one barrier() call can be active at any time, thus we can use (stageId, stageAttemptId) to identify the stage attempt where the barrier() call is from.
ContextBarrierId(int, int) - Constructor for class org.apache.spark.ContextBarrierId
Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
Continuous(long) - Static method in class org.apache.spark.sql.streaming.Trigger: A trigger that continuously processes streaming data, asynchronously checkpointing at the specified interval.
Continuous(long, TimeUnit) - Static method in class org.apache.spark.sql.streaming.Trigger: A trigger that continuously processes streaming data, asynchronously checkpointing at the specified interval.
Continuous(Duration) - Static method in class org.apache.spark.sql.streaming.Trigger: (Scala-friendly) A trigger that continuously processes streaming data, asynchronously checkpointing at the specified interval.
Continuous(String) - Static method in class org.apache.spark.sql.streaming.Trigger: A trigger that continuously processes streaming data, asynchronously checkpointing at the specified interval.
ContinuousInputPartition<T> - Interface in org.apache.spark.sql.sources.v2.reader: A mix-in interface for InputPartition.
ContinuousInputPartitionReader<T> - Interface in org.apache.spark.sql.sources.v2.reader.streaming: A variation on InputPartitionReader for use with streaming in continuous processing mode.
ContinuousReader - Interface in org.apache.spark.sql.sources.v2.reader.streaming: A mix-in interface for DataSourceReader.
ContinuousReadSupport - Interface in org.apache.spark.sql.sources.v2: A mix-in interface for DataSourceV2.
ContinuousSplit - Class in org.apache.spark.ml.tree: Split which tests a continuous feature.
conv(Column, int, int) - Static method in class org.apache.spark.sql.functions: Convert a number in a string column from one base to another.
CONVERT_METASTORE_ORC() - Static method in class org.apache.spark.sql.hive.HiveUtils
CONVERT_METASTORE_PARQUET() - Static method in class org.apache.spark.sql.hive.HiveUtils
CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING() - Static method in class org.apache.spark.sql.hive.HiveUtils
convertMatrixColumnsFromML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils: Converts matrix columns in an input Dataset to the Matrix type from the new Matrix type under the spark.ml package.
convertMatrixColumnsFromML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils: Converts matrix columns in an input Dataset to the Matrix type from the new Matrix type under the spark.ml package.
convertMatrixColumnsToML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils: Converts Matrix columns in an input Dataset from the Matrix type to the new Matrix type under the spark.ml package.
convertMatrixColumnsToML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils: Converts Matrix columns in an input Dataset from the Matrix type to the new Matrix type under the spark.ml package.
convertToCanonicalEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.GraphOps: Convert bi-directional edges into uni-directional ones.
convertToOldLossType(String) - Method in interface org.apache.spark.ml.tree.GBTRegressorParams
convertToTimeUnit(long, TimeUnit) - Static method in class org.apache.spark.streaming.ui.UIUtils: Convert milliseconds to the specified unit.
convertVectorColumnsFromML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils: Converts vector columns in an input Dataset to the Vector type from the new Vector type under the spark.ml package.
convertVectorColumnsFromML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils: Converts vector columns in an input Dataset to the Vector type from the new Vector type under the spark.ml package.
convertVectorColumnsToML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils: Converts vector columns in an input Dataset from the Vector type to the new Vector type under the spark.ml package.
convertVectorColumnsToML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils: Converts vector columns in an input Dataset from the Vector type to the new Vector type under the spark.ml package.
CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed: Represents a matrix in coordinate format.
CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Alternative constructor leaving matrix dimensions to be determined automatically.
copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassifier
copy(ParamMap) - Method in class org.apache.spark.ml.classification.LinearSVC
copy(ParamMap) - Method in class org.apache.spark.ml.classification.LinearSVCModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegression
copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
copy(ParamMap) - Method in class org.apache.spark.ml.classification.NaiveBayes
copy(ParamMap) - Method in class org.apache.spark.ml.classification.NaiveBayesModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.OneVsRest
copy(ParamMap) - Method in class org.apache.spark.ml.classification.OneVsRestModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.GaussianMixture
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.KMeans
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.KMeansModel
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.LDA
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.LocalLDAModel
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
copy(ParamMap) - Method in class org.apache.spark.ml.Estimator
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.Evaluator
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Binarizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
copy(ParamMap) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Bucketizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ChiSqSelector
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ColumnPruner
copy(ParamMap) - Method in class org.apache.spark.ml.feature.CountVectorizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.FeatureHasher
copy(ParamMap) - Method in class org.apache.spark.ml.feature.HashingTF
copy(ParamMap) - Method in class org.apache.spark.ml.feature.IDF
copy(ParamMap) - Method in class org.apache.spark.ml.feature.IDFModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Imputer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ImputerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.IndexToString
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Interaction
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinHashLSH
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinHashLSHModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinMaxScaler
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
copy(ParamMap) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
copy(ParamMap) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.PCA
copy(ParamMap) - Method in class org.apache.spark.ml.feature.PCAModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.PolynomialExpansion
copy(ParamMap) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.RegexTokenizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.RFormula
copy(ParamMap) - Method in class org.apache.spark.ml.feature.RFormulaModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.SQLTransformer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StopWordsRemover
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StringIndexer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StringIndexerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Tokenizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorAssembler
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorIndexer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorSizeHint
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorSlicer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Word2Vec
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Word2VecModel
copy(ParamMap) - Method in class org.apache.spark.ml.fpm.FPGrowth
copy(ParamMap) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
copy(ParamMap) - Method in class org.apache.spark.ml.fpm.PrefixSpan
copy(Vector, Vector) - Static method in class org.apache.spark.ml.linalg.BLAS: y = x
copy() - Method in class org.apache.spark.ml.linalg.DenseMatrix
copy() - Method in class org.apache.spark.ml.linalg.DenseVector
copy() - Method in interface org.apache.spark.ml.linalg.Matrix: Get a deep copy of the matrix.
copy() - Method in class org.apache.spark.ml.linalg.SparseMatrix
copy() - Method in class org.apache.spark.ml.linalg.SparseVector
copy() - Method in interface org.apache.spark.ml.linalg.Vector: Makes a deep copy of this vector.
copy(ParamMap) - Method in class org.apache.spark.ml.Model
copy() - Method in class org.apache.spark.ml.param.ParamMap: Creates a copy of this param map.
copy(ParamMap) - Method in interface org.apache.spark.ml.param.Params: Creates a copy of this instance with the same UID and some extra params.
copy(ParamMap) - Method in class org.apache.spark.ml.Pipeline
copy(ParamMap) - Method in class org.apache.spark.ml.PipelineModel
copy(ParamMap) - Method in class org.apache.spark.ml.PipelineStage
copy(ParamMap) - Method in class org.apache.spark.ml.Predictor
copy(ParamMap) - Method in class org.apache.spark.ml.recommendation.ALS
copy(ParamMap) - Method in class org.apache.spark.ml.recommendation.ALSModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
copy(ParamMap) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressor
copy(ParamMap) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
copy(ParamMap) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.IsotonicRegression
copy(ParamMap) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegression
copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
copy(ParamMap) - Method in class org.apache.spark.ml.Transformer
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
copy(ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
copy(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS: y = x
copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
copy() - Method in interface org.apache.spark.mllib.linalg.Matrix: Get a deep copy of the matrix.
copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
copy() - Method in interface org.apache.spark.mllib.linalg.Vector: Makes a deep copy of this vector.
copy() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
copy() - Method in class org.apache.spark.mllib.random.GammaGenerator
copy() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator: Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the class when applicable for non-locking concurrent usage.
copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
copy() - Method in class org.apache.spark.mllib.random.WeibullGenerator
copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy: Returns a shallow copy of this instance.
copy() - Method in interface org.apache.spark.sql.Row: Make a copy of the current Row object.
copy() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
copy() - Method in class org.apache.spark.sql.vectorized.ColumnarMap
copy() - Method in class org.apache.spark.sql.vectorized.ColumnarRow: Revisit this.
copy() - Method in class org.apache.spark.util.AccumulatorV2: Creates a new copy of this accumulator.
copy() - Method in class org.apache.spark.util.CollectionAccumulator
copy() - Method in class org.apache.spark.util.DoubleAccumulator
copy() - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
copy() - Method in class org.apache.spark.util.LongAccumulator
copy() - Method in class org.apache.spark.util.StatCounter: Clones this StatCounter.
copyAndReset() - Method in class org.apache.spark.util.AccumulatorV2: Creates a new copy of this accumulator that holds the zero value.
copyAndReset() - Method in class org.apache.spark.util.CollectionAccumulator
copyFileStreamNIO(FileChannel, FileChannel, long, long) - Static method in class org.apache.spark.util.Utils
copyStream(InputStream, OutputStream, boolean, boolean) - Static method in class org.apache.spark.util.Utils: Copy all data from an InputStream to an OutputStream.
copyValues(T, ParamMap) - Method in interface org.apache.spark.ml.param.Params: Copies param values from this instance to another instance for params shared by them.
cores() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
coresGranted() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
coresPerExecutor() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
corr(Dataset<?>, String, String) - Static method in class org.apache.spark.ml.stat.Correlation: :: Experimental :: Compute the correlation matrix for the input Dataset of Vectors using the specified method.
corr(Dataset<?>, String) - Static method in class org.apache.spark.ml.stat.Correlation: Compute the Pearson correlation matrix for the input Dataset of Vectors.
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics: Compute the Pearson correlation matrix for the input RDD of Vectors.
corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics: Compute the correlation matrix for the input RDD of Vectors using the specified method.
corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics: Compute the Pearson correlation for the input RDDs.
corr(JavaRDD<Double>, JavaRDD<Double>) - Static method in class org.apache.spark.mllib.stat.Statistics: Java-friendly version of corr().
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics: Compute the correlation for the input RDDs using the specified method.
corr(JavaRDD<Double>, JavaRDD<Double>, String) - Static method in class org.apache.spark.mllib.stat.Statistics: Java-friendly version of corr().
corr(String, String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Calculates the correlation of two columns of a DataFrame.
corr(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Calculates the Pearson Correlation Coefficient of two columns of a DataFrame.
corr(Column, Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the Pearson Correlation Coefficient for two columns.
corr(String, String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the Pearson Correlation Coefficient for two columns.
Correlation - Class in org.apache.spark.ml.stat: API for correlation functions in MLlib, compatible with DataFrames and Datasets.
Correlation() - Constructor for class org.apache.spark.ml.stat.Correlation
Correlation - Interface in org.apache.spark.mllib.stat.correlation: Trait for correlation algorithms.
CorrelationNames - Class in org.apache.spark.mllib.stat.correlation: Maintains supported and default correlation names.
CorrelationNames() - Constructor for class org.apache.spark.mllib.stat.correlation.CorrelationNames
Correlations - Class in org.apache.spark.mllib.stat.correlation: Delegates computation to the specific correlation object based on the input method name.
Correlations() - Constructor for class org.apache.spark.mllib.stat.correlation.Correlations
corrMatrix(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
cos(Column) - Static method in class org.apache.spark.sql.functions
cos(String) - Static method in class org.apache.spark.sql.functions
cosh(Column) - Static method in class org.apache.spark.sql.functions
cosh(String) - Static method in class org.apache.spark.sql.functions
CosineSilhouette - Class in org.apache.spark.ml.evaluation: An efficient, parallel implementation of the Silhouette using the cosine distance measure.
CosineSilhouette() - Constructor for class org.apache.spark.ml.evaluation.CosineSilhouette
count() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the number of elements in the RDD.
count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl: The number of edges in the RDD.
count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl: The number of vertices in the RDD.
count() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
count() - Method in class org.apache.spark.ml.regression.AFTAggregator
count(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
count(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Sample size.
count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample size.
count() - Method in class org.apache.spark.rdd.RDD: Return the number of elements in the RDD.
count() - Method in class org.apache.spark.sql.Dataset: Returns the number of rows in the Dataset.
count(MapFunction<T, Object>) - Static method in class org.apache.spark.sql.expressions.javalang.typed: Count aggregate function.
count(Function1<IN, Object>) - Static method in class org.apache.spark.sql.expressions.scalalang.typed: Count aggregate function.
count(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of items in a group.
count(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of items in a group.
count() - Method in class org.apache.spark.sql.KeyValueGroupedDataset: Returns a Dataset that contains a tuple with each key and the number of items present for that key.
count() - Method in class org.apache.spark.sql.RelationalGroupedDataset: Count the number of rows for each group.
count() - Method in class org.apache.spark.status.RDDPartitionSeq
count() - Method in class org.apache.spark.storage.ReadableChannelFileRegion
count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.util.DoubleAccumulator: Returns the number of elements added to the accumulator.
count() - Method in class org.apache.spark.util.LongAccumulator: Returns the number of elements added to the accumulator.
count() - Method in class org.apache.spark.util.StatCounter
countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long, double) - Method in class org.apache.spark.rdd.RDD: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return approximate number of distinct elements in the RDD.
countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD: Return approximate number of distinct elements in the RDD.
countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD: Return approximate number of distinct elements in the RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike: The asynchronous version of count, which returns a future for counting the number of elements in this RDD.
countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for counting the number of elements in the RDD.
countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Count the number of elements for each key, and return the result to the master as a Map.
countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions: Count the number of elements for each key, collecting the results to a local Map.
countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return the count of each unique value in this RDD as a local map of (value, count) pairs.
countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike: Approximate version of countByValue().
countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike: Approximate version of countByValue().
countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Approximate version of countByValue().
countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by counting the number of elements in a window over this DStream.
countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by counting the number of elements in a sliding window over this DStream.
countDistinct(Column, Column...) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of distinct items in a group.
countDistinct(String, String...) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of distinct items in a group.
countDistinct(Column, Seq<Column>) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of distinct items in a group.
countDistinct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of distinct items in a group.
COUNTER() - Static method in class org.apache.spark.metrics.sink.StatsdMetricType
CountingWritableChannel - Class in org.apache.spark.storage
CountingWritableChannel(WritableByteChannel) - Constructor for class org.apache.spark.storage.CountingWritableChannel
countMinSketch(String, int, int, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Builds a Count-min Sketch over a specified column.
countMinSketch(String, double, double, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Builds a Count-min Sketch over a specified column.
countMinSketch(Column, int, int, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Builds a Count-min Sketch over a specified column.
countMinSketch(Column, double, double, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Builds a Count-min Sketch over a specified column.
CountMinSketch - Class in org.apache.spark.util.sketch: A Count-min sketch is a probabilistic data structure used for frequency estimation in sub-linear space.
CountMinSketch() - Constructor for class org.apache.spark.util.sketch.CountMinSketch
CountMinSketch.Version - Enum in org.apache.spark.util.sketch
countTowardsTaskFailures() - Method in class org.apache.spark.ExecutorLostFailure
countTowardsTaskFailures() - Method in class org.apache.spark.FetchFailed: Fetch failures lead to a different failure handling path: (1) we don't abort the stage after 4 task failures; instead, we immediately go back to the stage that generated the map output and regenerate the missing data.
countTowardsTaskFailures() - Static method in class org.apache.spark.Resubmitted
countTowardsTaskFailures() - Method in class org.apache.spark.TaskCommitDenied: If a task failed because its attempt to commit was denied, do not count this failure towards failing the stage.
countTowardsTaskFailures() - Method in interface org.apache.spark.TaskFailedReason: Whether this task failure should be counted towards the maximum number of times the task is allowed to fail before the stage is aborted.
countTowardsTaskFailures() - Method in class org.apache.spark.TaskKilled
countTowardsTaskFailures() - Static method in class org.apache.spark.TaskResultLost
countTowardsTaskFailures() - Static method in class org.apache.spark.UnknownReason
CountVectorizer - Class in org.apache.spark.ml.feature: Extracts a vocabulary from document collections and generates a CountVectorizerModel.
CountVectorizer(String) - Constructor for class org.apache.spark.ml.feature.CountVectorizer
CountVectorizer() - Constructor for class org.apache.spark.ml.feature.CountVectorizer
CountVectorizerModel - Class in org.apache.spark.ml.feature: Converts a text document to a sparse vector of token counts.
CountVectorizerModel(String, String[]) - Constructor for class org.apache.spark.ml.feature.CountVectorizerModel
CountVectorizerModel(String[]) - Constructor for class org.apache.spark.ml.feature.CountVectorizerModel
CountVectorizerParams - Interface in org.apache.spark.ml.feature: Params for CountVectorizer and CountVectorizerModel.
cov() - Method in class org.apache.spark.ml.stat.distribution.MultivariateGaussian
cov(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Calculate the sample covariance of two numerical columns of a DataFrame.
covar_pop(Column, Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the population covariance for two columns.
covar_pop(String, String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the population covariance for two columns.
covar_samp(Column, Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sample covariance for two columns.
covar_samp(String, String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sample covariance for two columns.
covs() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
crc32(Column) - Static method in class org.apache.spark.sql.functions: Calculates the cyclic redundancy check value (CRC32) of a binary column and returns the value as a bigint.
CreatableRelationProvider - Interface in org.apache.spark.sql.sources
create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels: Create a new StorageLevel object.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD: Create an RDD that executes a SQL query on a JDBC connection and reads results.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD: Create an RDD that executes a SQL query on a JDBC connection and reads results.
create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD: Create a PartitionPruningRDD.
create(RpcEnvConfig) - Method in interface org.apache.spark.rpc.RpcEnvFactory
create(Object, DataType, Seq<Option<ScalaReflection.Schema>>) - Static method in class org.apache.spark.sql.expressions.SparkUserDefinedFunction
create(Object...) - Static method in class org.apache.spark.sql.RowFactory: Create a Row from the given arguments.
create(String) - Static method in class org.apache.spark.sql.streaming.ProcessingTime: Deprecated. Use Trigger.ProcessingTime(interval).
create(long, TimeUnit) - Static method in class org.apache.spark.sql.streaming.ProcessingTime: Deprecated. Use Trigger.ProcessingTime(interval, unit).
create(long) - Static method in class org.apache.spark.util.sketch.BloomFilter: Creates a BloomFilter with the expected number of insertions and a default expected false positive probability of 3%.
create(long, double) - Static method in class org.apache.spark.util.sketch.BloomFilter: Creates a BloomFilter with the expected number of insertions and expected false positive probability.
create(long, long) - Static method in class org.apache.spark.util.sketch.BloomFilter: Creates a BloomFilter with the given expectedNumItems and numBits; it picks the optimal numHashFunctions that minimizes fpp for the Bloom filter.
create(int, int, int) - Static method in class org.apache.spark.util.sketch.CountMinSketch: Creates a CountMinSketch with given depth, width, and random seed.
create(double, double, int) - Static method in class org.apache.spark.util.sketch.CountMinSketch: Creates a CountMinSketch with given relative error (eps), confidence, and random seed.
createArrayType(DataType) - Static method in class org.apache.spark.sql.types.DataTypes: Creates an ArrayType by specifying the data type of elements (elementType).
createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes: Creates an ArrayType by specifying the data type of elements (elementType) and whether the array contains null values (containsNull).
createAttrGroupForAttrNames(String, int, boolean, boolean) - Static method in class org.apache.spark.ml.feature.OneHotEncoderCommon: Creates an `AttributeGroup` with the required number of `BinaryAttribute`.
createCombiner() - Method in class org.apache.spark.Aggregator
createCommitter(int) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
createCompiledClass(String, File, TestUtils.JavaSourceFromString, Seq<URL>) - Static method in class org.apache.spark.TestUtils: Creates a compiled class with the source file.
createCompiledClass(String, File, String, String, Seq<URL>) - Static method in class org.apache.spark.TestUtils: Creates a compiled class with the given name.
createContinuousReader(Optional<StructType>, String, DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.ContinuousReadSupport: Creates a ContinuousReader to scan the data from this data source.
createContinuousReader(PartitionOffset) - Method in interface org.apache.spark.sql.sources.v2.reader.ContinuousInputPartition: Create an input partition reader with particular offset as its startOffset.
createCryptoInputStream(InputStream, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils: Helper method to wrap InputStream with CryptoInputStream for decryption.
createCryptoOutputStream(OutputStream, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils: Helper method to wrap OutputStream with CryptoOutputStream for encryption.
createDatabase(CatalogDatabase, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Creates a new database with the given name.
createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Creates a DataFrame from an RDD of Product (e.g.
createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Creates a DataFrame from a local Seq of Product.
createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SparkSession: :: DeveloperApi :: Creates a DataFrame from an RDD containing Rows using the given schema.
createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SparkSession: :: DeveloperApi :: Creates a DataFrame from a JavaRDD containing Rows using the given schema.
createDataFrame(List<Row>, StructType) - Method in class org.apache.spark.sql.SparkSession: :: DeveloperApi :: Creates a DataFrame from a java.util.List containing Rows using the given schema.
createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SparkSession: Applies a schema to an RDD of Java Beans.
createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SparkSession: Applies a schema to an RDD of Java Beans.
createDataFrame(List<?>, Class<?>) - Method in class org.apache.spark.sql.SparkSession: Applies a schema to a List of Java Beans.
createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(List<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(List<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
createDataset(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Creates a Dataset from a local Seq of data of a given type.
createDataset(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Creates a Dataset from an RDD of a given type.
createDataset(List<T>, Encoder<T>) - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Creates a Dataset from a java.util.List of a given type.
createDataset(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
createDataset(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
createDataset(List<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
createDataWriter(int, long, long) - Method in interface org.apache.spark.sql.sources.v2.writer.DataWriterFactory: Returns a data writer to do the actual writing work.
createDecimalType(int, int) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a DecimalType by specifying the precision and scale.
createDecimalType() - Static method in class org.apache.spark.sql.types.DataTypes: Creates a DecimalType with default precision and scale, which are 10 and 0.
createDF(RDD<byte[]>, StructType, SparkSession) - Static method in class org.apache.spark.sql.api.r.SQLUtils
createDirectory(String, String) - Static method in class org.apache.spark.util.Utils: Create a directory inside the given parent directory.
createdTempDir() - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
createExternalTable(String, String) - Method in class org.apache.spark.sql.catalog.Catalog: Deprecated.
Use createTable instead. Since 2.2.0.
createExternalTable(String, String, String) - Method in class org.apache.spark.sql.catalog.Catalog: Deprecated.
Use createTable instead. Since 2.2.0.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog: Deprecated.
Use createTable instead. Since 2.2.0.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog: Deprecated.
Use createTable instead. Since 2.2.0.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog: Deprecated.
Use createTable instead. Since 2.2.0.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog: Deprecated.
Use createTable instead. Since 2.2.0.
createExternalTable(String, String) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
Use sparkSession.catalog.createTable instead. Since 2.2.0.
createExternalTable(String, String, String) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
Use sparkSession.catalog.createTable instead. Since 2.2.0.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
Use sparkSession.catalog.createTable instead. Since 2.2.0.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
Use sparkSession.catalog.createTable instead. Since 2.2.0.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
Use sparkSession.catalog.createTable instead. Since 2.2.0.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
Use sparkSession.catalog.createTable instead. Since 2.2.0.
createFilter(StructType, Filter[]) - Static method in class org.apache.spark.sql.hive.orc.OrcFilters
createFunction(String, CatalogFunction) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Create a function in an existing database.
createGlobalTempView(String) - Method in class org.apache.spark.sql.Dataset: Creates a global temporary view using the given name.
CreateHiveTableAsSelectCommand - Class in org.apache.spark.sql.hive.execution: Create table and insert the query result into it.
CreateHiveTableAsSelectCommand(CatalogTable, LogicalPlan, Seq<String>, SaveMode) - Constructor for class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
createJar(Seq<File>, File, Option<String>) - Static method in class org.apache.spark.TestUtils: Create a jar file that contains this set of files.
createJarWithClasses(Seq<String>, String, Seq<Tuple2<String, String>>, Seq<URL>) - Static method in class org.apache.spark.TestUtils: Create a jar that defines classes with the given names.
createJarWithFiles(Map<String, String>, File) - Static method in class org.apache.spark.TestUtils: Create a jar file containing multiple files.
createJobContext(String, int) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
createJobID(Date, int) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
createJobTrackerID(Date) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
createKey(SparkConf) - Static method in class org.apache.spark.security.CryptoStreamUtils: Creates a new encryption key.
createListeners(SparkConf, ElementTrackingStore) - Method in interface org.apache.spark.status.AppHistoryServerPlugin: Creates listeners to replay the event logs.
createLogForDriver(SparkConf, String, Configuration) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils: Create a WriteAheadLog for the driver.
createLogForReceiver(SparkConf, String, Configuration) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils: Create a WriteAheadLog for the receiver.
createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a MapType by specifying the data type of keys (keyType) and values (valueType).
createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a MapType by specifying the data type of keys (keyType), the data type of values (valueType), and whether values contain any null value (valueContainsNull).
createMetrics(long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long, long) - Static method in class org.apache.spark.status.LiveEntityHelpers
createMetrics(long) - Static method in class org.apache.spark.status.LiveEntityHelpers
createMicroBatchReader(Optional<StructType>, String, DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.MicroBatchReadSupport: Creates a MicroBatchReader to read batches of data from this data source in a streaming query.
createModel(DenseVector<Object>) - Method in interface org.apache.spark.ml.ann.Layer: Returns the instance of the layer based on weights provided.
createOrReplaceGlobalTempView(String) - Method in class org.apache.spark.sql.Dataset: Creates or replaces a global temporary view using the given name.
createOrReplaceTempView(String) - Method in class org.apache.spark.sql.Dataset: Creates or replaces a local temporary view using the given name.
createOutputOperationFailureForUI(String) - Static method in class org.apache.spark.streaming.ui.UIUtils
createPartitionReader() - Method in interface org.apache.spark.sql.sources.v2.reader.InputPartition: Returns an input partition reader to do the actual reading work.
createPartitions(String, String, Seq<CatalogTablePartition>, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Create one or many partitions in the given table.
createPathFromString(String, JobConf) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
createPMMLModelExport(Object) - Static method in class org.apache.spark.mllib.pmml.export.PMMLModelExportFactory: Factory object that helps create the necessary PMMLModelExport implementation, taking as input the machine learning model (for example, KMeansModel).
createProxyHandler(Function1<String, Option<String>>) - Static method in class org.apache.spark.ui.JettyUtils: Create a handler for proxying requests to Workers and Application Drivers.
createProxyLocationHeader(String, HttpServletRequest, URI) - Static method in class org.apache.spark.ui.JettyUtils
createProxyURI(String, String, String, String) - Static method in class org.apache.spark.ui.JettyUtils
createRDDFromArray(JavaSparkContext, byte[][]) - Static method in class org.apache.spark.api.r.RRDD: Create an RRDD given a sequence of byte arrays.
createRDDFromFile(JavaSparkContext, String, int) - Static method in class org.apache.spark.api.r.RRDD: Create an RRDD given a temporary file name.
createReadableChannel(ReadableByteChannel, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils: Wrap a ReadableByteChannel for decryption.
createReader(StructType, DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.ReadSupport: Creates a DataSourceReader to scan the data from this data source.
createReader(DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.ReadSupport: Creates a DataSourceReader to scan the data from this data source.
createRedirectHandler(String, String, Function1<HttpServletRequest, BoxedUnit>, String, Set<String>) - Static method in class org.apache.spark.ui.JettyUtils: Create a handler that always redirects the user to the given path.
createRelation(SQLContext, SaveMode, Map<String, String>, Dataset<Row>) - Method in interface org.apache.spark.sql.sources.CreatableRelationProvider: Saves a DataFrame to a destination (using data source-specific parameters).
createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider: Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>, StructType) - Method in interface org.apache.spark.sql.sources.SchemaRelationProvider: Returns a new base relation with the given parameters and user defined schema.
createSchedulerBackend(SparkContext, String, TaskScheduler) - Method in interface org.apache.spark.scheduler.ExternalClusterManager: Create a scheduler backend for the given SparkContext and scheduler.
createSecret(SparkConf) - Static method in class org.apache.spark.util.Utils
createServlet(JettyUtils.ServletParams<T>, org.apache.spark.SecurityManager, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils
createServletHandler(String, JettyUtils.ServletParams<T>, org.apache.spark.SecurityManager, SparkConf, String) - Static method in class org.apache.spark.ui.JettyUtils: Create a context handler that responds to a request with the given path prefix.
createServletHandler(String, HttpServlet, String) - Static method in class org.apache.spark.ui.JettyUtils: Create a context handler that responds to a request with the given path prefix.
createSink(SQLContext, Map<String, String>, Seq<String>, OutputMode) - Method in interface org.apache.spark.sql.sources.StreamSinkProvider
createSource(SQLContext, String, Option<StructType>, String, Map<String, String>) - Method in interface org.apache.spark.sql.sources.StreamSourceProvider
createSparkContext(String, String, String, String[], Map<Object, Object>, Map<Object, Object>) - Static method in class org.apache.spark.api.r.RRDD
createStaticHandler(String, String) - Static method in class org.apache.spark.ui.JettyUtils: Create a handler for serving files from a static directory.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, String, String, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, String, String, String, String, String, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>, String, String, String, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, int, Duration, StorageLevel, String, String, String, String, String) - Method in class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
createStreamWriter(String, StructType, OutputMode, DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.StreamWriteSupport: Creates an optional StreamWriter to save the data to this data source.
createStructField(String, String, boolean) - Static method in class org.apache.spark.sql.api.r.SQLUtils
createStructField(String, DataType, boolean, Metadata) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a StructField by specifying the name (name), data type (dataType) and whether values of this field can be null values (nullable).
createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a StructField with empty metadata.
createStructType(Seq<StructField>) - Static method in class org.apache.spark.sql.api.r.SQLUtils
createStructType(List<StructField>) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a StructType with the given list of StructFields (fields).
createStructType(StructField[]) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a StructType with the given StructField array (fields).
createTable(String, String) - Method in class org.apache.spark.sql.catalog.Catalog: :: Experimental :: Creates a table from the given path and returns the corresponding DataFrame.
createTable(String, String, String) - Method in class org.apache.spark.sql.catalog.Catalog: :: Experimental :: Creates a table from the given path based on a data source and returns the corresponding DataFrame.
createTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog: :: Experimental :: Creates a table based on the dataset in a data source and a set of options.
createTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog: :: Experimental :: (Scala-specific) Creates a table based on the dataset in a data source and a set of options.
createTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog: :: Experimental :: Create a table based on the dataset in a data source, a schema and a set of options.
createTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog: :: Experimental :: (Scala-specific) Create a table based on the dataset in a data source, a schema and a set of options.
createTable(CatalogTable, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Creates a table with the given metadata.
createTaskAttemptContext(String, int, int, int) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
createTaskScheduler(SparkContext, String) - Method in interface org.apache.spark.scheduler.ExternalClusterManager: Create a task scheduler instance for the given SparkContext.
createTempDir(String, String) - Static method in class org.apache.spark.util.Utils: Create a temporary directory inside the given parent directory.
createTempView(String) - Method in class org.apache.spark.sql.Dataset: Creates a local temporary view using the given name.
createUnsafe(long, int, int) - Static method in class org.apache.spark.sql.types.Decimal: Creates a decimal from unscaled, precision and scale without checking the bounds.
createWorkspace(int) - Static method in class org.apache.spark.mllib.optimization.NNLS
createWritableChannel(WritableByteChannel, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils: Wrap a WritableByteChannel for encryption.
createWriter(String, StructType, SaveMode, DataSourceOptions) - Method in interface org.apache.spark.sql.sources.v2.WriteSupport: Creates an optional DataSourceWriter to save the data to this data source.
createWriterFactory() - Method in interface org.apache.spark.sql.sources.v2.writer.DataSourceWriter: Creates a writer factory which will be serialized and sent to executors.
crossJoin(Dataset<?>) - Method in class org.apache.spark.sql.Dataset: Explicit cartesian join with another DataFrame.
crosstab(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Computes a pair-wise frequency table of the given columns.
CrossValidator - Class in org.apache.spark.ml.tuning: K-fold cross validation performs model selection by splitting the dataset into a set of non-overlapping, randomly partitioned folds, which are used as separate training and test datasets. For example, with k=3 folds, K-fold cross validation will generate 3 (training, test) dataset pairs, each of which uses 2/3 of the data for training and 1/3 for testing.
CrossValidator(String) - Constructor for class org.apache.spark.ml.tuning.CrossValidator
CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
CrossValidatorModel - Class in org.apache.spark.ml.tuning: CrossValidatorModel contains the model with the highest average cross-validation metric across folds and uses this model to transform input data.
CrossValidatorModel.CrossValidatorModelWriter - Class in org.apache.spark.ml.tuning: Writer for CrossValidatorModel.
CrossValidatorParams - Interface in org.apache.spark.ml.tuning: Params for CrossValidator and CrossValidatorModel.
CryptoStreamUtils - Class in org.apache.spark.security: A util class for manipulating IO encryption and decryption streams.
CryptoStreamUtils() - Constructor for class org.apache.spark.security.CryptoStreamUtils
CryptoStreamUtils.BaseErrorHandler - Interface in org.apache.spark.security: SPARK-25535.
CryptoStreamUtils.ErrorHandlingReadableChannel - Class in org.apache.spark.security
csv(String...) - Method in class org.apache.spark.sql.DataFrameReader: Loads CSV files and returns the result as a DataFrame.
csv(String) - Method in class org.apache.spark.sql.DataFrameReader: Loads a CSV file and returns the result as a DataFrame.
csv(Dataset<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads a Dataset[String] storing CSV rows and returns the result as a DataFrame.
csv(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads CSV files and returns the result as a DataFrame.
csv(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame in CSV format at the specified path.
csv(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Loads a CSV file stream and returns the result as a DataFrame.
cube(Column...) - Method in class org.apache.spark.sql.Dataset: Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.
cube(String, String...) - Method in class org.apache.spark.sql.Dataset: Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.
cube(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.
cube(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset: Create a multi-dimensional cube for the current Dataset using the specified columns, so we can run aggregation on them.
CubeType$() - Constructor for class org.apache.spark.sql.RelationalGroupedDataset.CubeType$
cume_dist() - Static method in class org.apache.spark.sql.functions: Window function: returns the cumulative distribution of values within a window partition, i.e.
current_date() - Static method in class org.apache.spark.sql.functions: Returns the current date as a date column.
current_timestamp() - Static method in class org.apache.spark.sql.functions: Returns the current timestamp as a timestamp column.
currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
currentDatabase() - Method in class org.apache.spark.sql.catalog.Catalog: Returns the current default database in this session.
currentResult() - Method in interface org.apache.spark.partial.ApproximateEvaluator
currentRow() - Static method in class org.apache.spark.sql.expressions.Window: Value representing the current row.
currentRow() - Static method in class org.apache.spark.sql.functions: Deprecated.
Use Window.currentRow. Since 2.4.0.
currPrefLocs(Partition, RDD<?>) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
customMetrics() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress
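
Note on the CrossValidator entry above: the k=3 arithmetic it describes (3 (training, test) pairs, each using 2/3 of the data for training and 1/3 for testing) can be illustrated with a minimal plain-Java sketch. This is not Spark's implementation and uses no Spark API; the class KFoldSketch, its nested Split type, and the kFold method are hypothetical names introduced only for illustration, and the round-robin fold assignment is an assumption chosen for simplicity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of k-fold splitting; not part of Spark.
public class KFoldSketch {

    /** One (training, test) pair of row indices. */
    public static final class Split {
        public final List<Integer> training;
        public final List<Integer> test;

        Split(List<Integer> training, List<Integer> test) {
            this.training = training;
            this.test = test;
        }
    }

    /**
     * Shuffles the row indices 0..numRows-1 with the given seed, then builds
     * k non-overlapping folds. Fold i serves as the test set of pair i, and
     * the remaining folds together form its training set.
     */
    public static List<Split> kFold(int numRows, int k, long seed) {
        if (k < 2 || k > numRows) {
            throw new IllegalArgumentException("k must be in [2, numRows]");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numRows; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<Split> splits = new ArrayList<>();
        for (int fold = 0; fold < k; fold++) {
            List<Integer> training = new ArrayList<>();
            List<Integer> test = new ArrayList<>();
            for (int pos = 0; pos < numRows; pos++) {
                // Assumption: shuffled position modulo k decides the fold.
                if (pos % k == fold) {
                    test.add(indices.get(pos));
                } else {
                    training.add(indices.get(pos));
                }
            }
            splits.add(new Split(training, test));
        }
        return splits;
    }

    public static void main(String[] args) {
        // With 9 rows and k=3, each pair has 6 training rows and 3 test rows.
        for (Split s : kFold(9, 3, 42L)) {
            System.out.println("train=" + s.training.size() + " test=" + s.test.size());
        }
    }
}
```

Every row lands in exactly one test fold, so the k test sets together cover the data once; the actual CrossValidator works on Datasets and may partition rows differently.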

D

DAGSchedulerEvent - Interface in org.apache.spark.scheduler: Types of events that can be handled by the DAGScheduler.
dapply(Dataset<Row>, byte[], byte[], Object[], StructType) - Static method in class org.apache.spark.sql.api.r.SQLUtils: The helper function for dapply() on the R side.
Data(Vector, double, Option<Object>) - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
Data(double[], double[], double[][]) - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
Data(double[], double[], double[][], String) - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
Data(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data
Data(Vector, double) - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
data() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
Data$() - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data$
Data$() - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data$
Data$() - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data$
Data$() - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data$
Data$() - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data$
Database - Class in org.apache.spark.sql.catalog: A database in Spark, as returned by the listDatabases method defined in Catalog.
Database(String, String, String) - Constructor for class org.apache.spark.sql.catalog.Database
database() - Method in class org.apache.spark.sql.catalog.Function
database() - Method in class org.apache.spark.sql.catalog.Table
DATABASE_KEY - Static variable in class org.apache.spark.sql.sources.v2.DataSourceOptions: The option key for database name.
databaseExists(String) - Method in class org.apache.spark.sql.catalog.Catalog: Check if the database with the specified name exists.
databaseExists(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Return whether a database with the specified name exists.
databaseName() - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions: Returns the value of the database name option.
databaseTypeDefinition() - Method in class org.apache.spark.sql.jdbc.JdbcType
dataDistribution() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
DATAFRAME_DAPPLY() - Static method in class org.apache.spark.api.r.RRunnerModes
DATAFRAME_GAPPLY() - Static method in class org.apache.spark.api.r.RRunnerModes
DataFrameNaFunctions - Class in org.apache.spark.sql: Functionality for working with missing data in DataFrames.
DataFrameReader - Class in org.apache.spark.sql: Interface used to load a Dataset from external storage systems (e.g.
DataFrameStatFunctions - Class in org.apache.spark.sql: Statistic functions for DataFrames.
DataFrameWriter<T> - Class in org.apache.spark.sql: Interface used to write a Dataset to external storage systems (e.g.
Dataset<T> - Class in org.apache.spark.sql: A Dataset is a strongly typed collection of domain-specific objects that can be transformed in parallel using functional or relational operations.
Dataset(SparkSession, LogicalPlan, Encoder<T>) - Constructor for class org.apache.spark.sql.Dataset
Dataset(SQLContext, LogicalPlan, Encoder<T>) - Constructor for class org.apache.spark.sql.Dataset
DatasetHolder<T> - Class in org.apache.spark.sql: A container for a Dataset, used for implicit conversions in Scala.
DatasetUtils - Class in org.apache.spark.ml.util
DatasetUtils() - Constructor for class org.apache.spark.ml.util.DatasetUtils
dataSource() - Method in interface org.apache.spark.ui.PagedTable
DataSourceOptions - Class in org.apache.spark.sql.sources.v2: An immutable string-to-string map in which keys are case-insensitive.
DataSourceOptions(Map<String, String>) - Constructor for class org.apache.spark.sql.sources.v2.DataSourceOptions
DataSourceReader - Interface in org.apache.spark.sql.sources.v2.reader: A data source reader that is returned by ReadSupport.createReader(DataSourceOptions) or ReadSupport.createReader(StructType, DataSourceOptions).
DataSourceRegister - Interface in org.apache.spark.sql.sources: Data sources should implement this trait so that they can register an alias to their data source.
DataSourceV2 - Interface in org.apache.spark.sql.sources.v2: The base interface for data source v2.
DataSourceWriter - Interface in org.apache.spark.sql.sources.v2.writer: A data source writer that is returned by WriteSupport.createWriter(String, StructType, SaveMode, DataSourceOptions) or StreamWriteSupport.createStreamWriter(String, StructType, OutputMode, DataSourceOptions).
DataStreamReader - Class in org.apache.spark.sql.streaming: Interface used to load a streaming Dataset from external storage systems (e.g.
DataStreamWriter<T> - Class in org.apache.spark.sql.streaming: Interface used to write a streaming Dataset to external storage systems (e.g.
dataTablesHeaderNodes(HttpServletRequest) - Static method in class org.apache.spark.ui.UIUtils
dataType() - Method in class org.apache.spark.sql.catalog.Column
dataType() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: The DataType of the returned value of this UserDefinedAggregateFunction.
dataType() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
DataType - Class in org.apache.spark.sql.types: The base type of all Spark SQL data types.
DataType() - Constructor for class org.apache.spark.sql.types.DataType
dataType() - Method in class org.apache.spark.sql.types.StructField
dataType() - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the data type of this column vector.
DataTypes - Class in org.apache.spark.sql.types: To get or create a specific data type, users should use the singleton objects and factory methods provided by this class.
DataTypes() - Constructor for class org.apache.spark.sql.types.DataTypes
DataValidators - Class in org.apache.spark.mllib.util: :: DeveloperApi :: A collection of methods used to validate data before applying ML algorithms.
DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
DataWriter<T> - Interface in org.apache.spark.sql.sources.v2.writer: A data writer that is returned by DataWriterFactory.createDataWriter(int, long, long) and is responsible for writing data for an input RDD partition.
DataWriterFactory<T> - Interface in org.apache.spark.sql.sources.v2.writer: A factory of DataWriter returned by DataSourceWriter.createWriterFactory(), which is responsible for creating and initializing the actual data writer on the executor side.
date() - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type date.
DATE() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable date type.
date_add(Column, int) - Static method in class org.apache.spark.sql.functions: Returns the date that is the given number of days after start.
date_format(Column, String) - Static method in class org.apache.spark.sql.functions: Converts a date/timestamp/string to a value of string in the format specified by the date format given by the second argument.
date_sub(Column, int) - Static method in class org.apache.spark.sql.functions: Returns the date that is the given number of days before start.
date_trunc(String, Column) - Static method in class org.apache.spark.sql.functions: Returns timestamp truncated to the unit specified by the format.
datediff(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns the number of days from start to end.
DateType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the DateType object.
DateType - Class in org.apache.spark.sql.types: A date type, supporting "0001-01-01" through "9999-12-31".
DateType() - Constructor for class org.apache.spark.sql.types.DateType
dayofmonth(Column) - Static method in class org.apache.spark.sql.functions: Extracts the day of the month as an integer from a given date/timestamp/string.
dayofweek(Column) - Static method in class org.apache.spark.sql.functions: Extracts the day of the week as an integer from a given date/timestamp/string.
dayofyear(Column) - Static method in class org.apache.spark.sql.functions: Extracts the day of the year as an integer from a given date/timestamp/string.
DB2Dialect - Class in org.apache.spark.sql.jdbc
DB2Dialect() - Constructor for class org.apache.spark.sql.jdbc.DB2Dialect
DCT - Class in org.apache.spark.ml.feature: A feature transformer that takes the 1D discrete cosine transform of a real vector.
DCT(String) - Constructor for class org.apache.spark.ml.feature.DCT
DCT() - Constructor for class org.apache.spark.ml.feature.DCT
deallocate() - Method in class org.apache.spark.storage.ReadableChannelFileRegion
decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
decimal() - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type decimal.
decimal(int, int) - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type decimal.
DECIMAL() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable decimal type.
Decimal - Class in org.apache.spark.sql.types: A mutable implementation of BigDecimal that can hold a Long if values are small enough.
Decimal() - Constructor for class org.apache.spark.sql.types.Decimal
Decimal.DecimalAsIfIntegral$ - Class in org.apache.spark.sql.types: An Integral evidence parameter for Decimals.
Decimal.DecimalIsConflicted - Interface in org.apache.spark.sql.types: Common methods for Decimal evidence parameters
Decimal.DecimalIsFractional$ - Class in org.apache.spark.sql.types: A Fractional evidence parameter for Decimals.
DecimalAsIfIntegral$() - Constructor for class org.apache.spark.sql.types.Decimal.DecimalAsIfIntegral$
DecimalIsFractional$() - Constructor for class org.apache.spark.sql.types.Decimal.DecimalIsFractional$
DecimalType - Class in org.apache.spark.sql.types: The data type representing java.math.BigDecimal values.
DecimalType(int, int) - Constructor for class org.apache.spark.sql.types.DecimalType
DecimalType(int) - Constructor for class org.apache.spark.sql.types.DecimalType
DecimalType() - Constructor for class org.apache.spark.sql.types.DecimalType
DecimalType.Expression$ - Class in org.apache.spark.sql.types
DecimalType.Fixed$ - Class in org.apache.spark.sql.types
decimalTypeInfoToCatalyst(PrimitiveObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
DecisionTree - Class in org.apache.spark.mllib.tree: A class which implements a decision tree learning algorithm for classification and regression.
DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
DecisionTreeClassificationModel - Class in org.apache.spark.ml.classification: Decision tree model (http://en.wikipedia.org/wiki/Decision_tree_learning) for classification.
DecisionTreeClassifier - Class in org.apache.spark.ml.classification: Decision tree learning algorithm (http://en.wikipedia.org/wiki/Decision_tree_learning) for classification.
DecisionTreeClassifier(String) - Constructor for class org.apache.spark.ml.classification.DecisionTreeClassifier
DecisionTreeClassifier() - Constructor for class org.apache.spark.ml.classification.DecisionTreeClassifier
DecisionTreeClassifierParams - Interface in org.apache.spark.ml.tree
DecisionTreeModel - Interface in org.apache.spark.ml.tree: Abstraction for Decision Tree models.
DecisionTreeModel - Class in org.apache.spark.mllib.tree.model: Decision tree model for classification or regression.
DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
DecisionTreeModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.tree.model
DecisionTreeModel.SaveLoadV1_0$.NodeData - Class in org.apache.spark.mllib.tree.model: Model data for model import/export
DecisionTreeModel.SaveLoadV1_0$.NodeData$ - Class in org.apache.spark.mllib.tree.model
DecisionTreeModel.SaveLoadV1_0$.PredictData - Class in org.apache.spark.mllib.tree.model
DecisionTreeModel.SaveLoadV1_0$.PredictData$ - Class in org.apache.spark.mllib.tree.model
DecisionTreeModel.SaveLoadV1_0$.SplitData - Class in org.apache.spark.mllib.tree.model
DecisionTreeModel.SaveLoadV1_0$.SplitData$ - Class in org.apache.spark.mllib.tree.model
DecisionTreeModelReadWrite - Class in org.apache.spark.ml.tree: Helper classes for tree model persistence
DecisionTreeModelReadWrite() - Constructor for class org.apache.spark.ml.tree.DecisionTreeModelReadWrite
DecisionTreeModelReadWrite.NodeData - Class in org.apache.spark.ml.tree: Info for a Node
DecisionTreeModelReadWrite.NodeData$ - Class in org.apache.spark.ml.tree
DecisionTreeModelReadWrite.SplitData - Class in org.apache.spark.ml.tree: Info for a Split
DecisionTreeModelReadWrite.SplitData$ - Class in org.apache.spark.ml.tree
DecisionTreeParams - Interface in org.apache.spark.ml.tree: Parameters for Decision Tree-based algorithms.
DecisionTreeRegressionModel - Class in org.apache.spark.ml.regression: Decision tree (Wikipedia) model for regression.
DecisionTreeRegressor - Class in org.apache.spark.ml.regression: Decision tree learning algorithm for regression.
DecisionTreeRegressor(String) - Constructor for class org.apache.spark.ml.regression.DecisionTreeRegressor
DecisionTreeRegressor() - Constructor for class org.apache.spark.ml.regression.DecisionTreeRegressor
DecisionTreeRegressorParams - Interface in org.apache.spark.ml.tree
decode(Column, String) - Static method in class org.apache.spark.sql.functions: Decodes the first argument from a binary into a string using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').
decodeFileNameInURI(URI) - Static method in class org.apache.spark.util.Utils: Get the file name from uri's raw path and decode it.
decodeStructField(StructField, boolean) - Method in interface org.apache.spark.ml.attribute.AttributeFactory: Creates an Attribute from a StructField instance, optionally preserving name.
decodeURLParameter(String) - Static method in class org.apache.spark.ui.UIUtils: Decode URLParameter if URL is encoded by YARN-WebAppProxyServlet.
DEFAULT_CONNECTION_TIMEOUT() - Static method in class org.apache.spark.api.r.SparkRDefaults
DEFAULT_DRIVER_MEM_MB() - Static method in class org.apache.spark.util.Utils: Define a default value for driver memory here since this value is referenced across the code base and nearly all files already use Utils.scala
DEFAULT_HEARTBEAT_INTERVAL() - Static method in class org.apache.spark.api.r.SparkRDefaults
DEFAULT_MAX_FAILURES() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
DEFAULT_MAX_TO_STRING_FIELDS() - Static method in class org.apache.spark.util.Utils: The performance overhead of creating and logging strings for wide schemas can be large.
DEFAULT_NUM_RBACKEND_THREADS() - Static method in class org.apache.spark.api.r.SparkRDefaults
DEFAULT_NUMBER_EXECUTORS() - Static method in class org.apache.spark.scheduler.cluster.SchedulerBackendUtils
DEFAULT_ROLLING_INTERVAL_SECS() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
DEFAULT_SHUTDOWN_PRIORITY() - Static method in class org.apache.spark.util.ShutdownHookManager
defaultAttr() - Static method in class org.apache.spark.ml.attribute.BinaryAttribute: The default binary attribute.
defaultAttr() - Static method in class org.apache.spark.ml.attribute.NominalAttribute: The default nominal attribute.
defaultAttr() - Static method in class org.apache.spark.ml.attribute.NumericAttribute: The default numeric attribute.
defaultCopy(ParamMap) - Method in interface org.apache.spark.ml.param.Params: Default implementation of copy with extra params.
defaultCorrName() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
DefaultCredentials - Class in org.apache.spark.streaming.kinesis: Returns DefaultAWSCredentialsProviderChain for authentication.
DefaultCredentials() - Constructor for class org.apache.spark.streaming.kinesis.DefaultCredentials
defaultLink() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
defaultLink() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
defaultLink() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
defaultLink() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext: Default min number of partitions for Hadoop RDDs when not given by user
defaultMinPartitions() - Method in class org.apache.spark.SparkContext: Default min number of partitions for Hadoop RDDs when not given by user. Notice that we use math.min so the "defaultMinPartitions" cannot be higher than 2.
defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext: Default level of parallelism to use when not given by user (e.g.
defaultParallelism() - Method in interface org.apache.spark.scheduler.SchedulerBackend
defaultParallelism() - Method in interface org.apache.spark.scheduler.TaskScheduler
defaultParallelism() - Method in class org.apache.spark.SparkContext: Default level of parallelism to use when not given by user (e.g.
defaultParamMap() - Method in interface org.apache.spark.ml.param.Params: Internal param map for default values.
defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy: Returns default configuration for the boosting algorithm
defaultParams(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy: Returns default configuration for the boosting algorithm
DefaultParamsReadable<T> - Interface in org.apache.spark.ml.util: :: DeveloperApi ::
DefaultParamsWritable - Interface in org.apache.spark.ml.util: :: DeveloperApi ::
DefaultPartitionCoalescer - Class in org.apache.spark.rdd: Coalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of this RDD computes one or more of the parent ones.
DefaultPartitionCoalescer(double) - Constructor for class org.apache.spark.rdd.DefaultPartitionCoalescer
DefaultPartitionCoalescer.PartitionLocations - Class in org.apache.spark.rdd
defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner: Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
defaultSize() - Method in class org.apache.spark.sql.types.ArrayType: The default size of a value of the ArrayType is the default size of the element type.
defaultSize() - Method in class org.apache.spark.sql.types.BinaryType: The default size of a value of the BinaryType is 100 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.BooleanType: The default size of a value of the BooleanType is 1 byte.
defaultSize() - Method in class org.apache.spark.sql.types.ByteType: The default size of a value of the ByteType is 1 byte.
defaultSize() - Method in class org.apache.spark.sql.types.CalendarIntervalType
defaultSize() - Method in class org.apache.spark.sql.types.DataType: The default size of a value of this data type, used internally for size estimation.
defaultSize() - Method in class org.apache.spark.sql.types.DateType: The default size of a value of the DateType is 4 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.DecimalType: The default size of a value of the DecimalType is 8 bytes when precision is at most 18, and 16 bytes otherwise.
defaultSize() - Method in class org.apache.spark.sql.types.DoubleType: The default size of a value of the DoubleType is 8 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.FloatType: The default size of a value of the FloatType is 4 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.HiveStringType
defaultSize() - Method in class org.apache.spark.sql.types.IntegerType: The default size of a value of the IntegerType is 4 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.LongType: The default size of a value of the LongType is 8 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.MapType: The default size of a value of the MapType is (the default size of the key type + the default size of the value type).
defaultSize() - Method in class org.apache.spark.sql.types.NullType
defaultSize() - Method in class org.apache.spark.sql.types.ObjectType
defaultSize() - Method in class org.apache.spark.sql.types.ShortType: The default size of a value of the ShortType is 2 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.StringType: The default size of a value of the StringType is 20 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.StructType: The default size of a value of the StructType is the sum of the default sizes of all field types.
defaultSize() - Method in class org.apache.spark.sql.types.TimestampType: The default size of a value of the TimestampType is 8 bytes.
defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy: Construct a default set of parameters for DecisionTree
defaultStrategy(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy: Construct a default set of parameters for DecisionTree
DefaultTopologyMapper - Class in org.apache.spark.storage: A TopologyMapper that assumes all nodes are in the same rack
DefaultTopologyMapper(SparkConf) - Constructor for class org.apache.spark.storage.DefaultTopologyMapper
defaultValue() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefault
defaultValue() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultFunction
defaultValue() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultString
defaultValueString() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefault
defaultValueString() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultFunction
defaultValueString() - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultString
degree() - Method in class org.apache.spark.ml.feature.PolynomialExpansion: The polynomial degree to expand, which should be greater than or equal to 1.
degrees() - Method in class org.apache.spark.graphx.GraphOps: The degree of each vertex in the graph.
degrees(Column) - Static method in class org.apache.spark.sql.functions: Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
degrees(String) - Static method in class org.apache.spark.sql.functions: Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
degreesOfFreedom() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: Degrees of freedom.
degreesOfFreedom() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Degrees of freedom
degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult: Returns the degree(s) of freedom of the hypothesis test.
delegate() - Method in class org.apache.spark.InterruptibleIterator
deleteCheckpointFiles() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel: :: DeveloperApi ::
deleteExternalTmpPath(Configuration) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
deleteRecursively(File) - Static method in class org.apache.spark.util.Utils: Delete a file or directory and its contents recursively.
deleteWithJob(FileSystem, Path, boolean) - Method in class org.apache.spark.internal.io.FileCommitProtocol: Specifies that a file should be deleted with the commit of this job.
delimiterOptions() - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
delta() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Tweedie$: Constant used in initialization and deviance to avoid numerical issues.
dense(int, int, double[]) - Static method in class org.apache.spark.ml.linalg.Matrices: Creates a column-major dense matrix.
dense(double, double...) - Static method in class org.apache.spark.ml.linalg.Vectors: Creates a dense vector from its values.
dense(double, Seq<Object>) - Static method in class org.apache.spark.ml.linalg.Vectors: Creates a dense vector from its values.
dense(double[]) - Static method in class org.apache.spark.ml.linalg.Vectors: Creates a dense vector from a double array.
dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices: Creates a column-major dense matrix.
dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from its values.
dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from its values.
dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from a double array.
dense_rank() - Static method in class org.apache.spark.sql.functions: Window function: returns the rank of rows within a window partition, without any gaps.
DenseMatrix - Class in org.apache.spark.ml.linalg: Column-major dense matrix.
DenseMatrix(int, int, double[], boolean) - Constructor for class org.apache.spark.ml.linalg.DenseMatrix
DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.ml.linalg.DenseMatrix: Column-major dense matrix.
DenseMatrix - Class in org.apache.spark.mllib.linalg: Column-major dense matrix.
DenseMatrix(int, int, double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix: Column-major dense matrix.
DenseVector - Class in org.apache.spark.ml.linalg: A dense vector represented by a value array.
DenseVector(double[]) - Constructor for class org.apache.spark.ml.linalg.DenseVector
DenseVector - Class in org.apache.spark.mllib.linalg: A dense vector represented by a value array.
DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
dependencies() - Method in class org.apache.spark.rdd.RDD: Get the list of dependencies of this RDD, taking into account whether the RDD is checkpointed or not.
dependencies() - Method in class org.apache.spark.streaming.dstream.DStream: List of parent DStreams on which this DStream depends
dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
Dependency<T> - Class in org.apache.spark: :: DeveloperApi :: Base class for dependencies.
Dependency() - Constructor for class org.apache.spark.Dependency
DEPLOY_MODE - Static variable in class org.apache.spark.launcher.SparkLauncher: The Spark deploy mode.
deployMode() - Method in class org.apache.spark.SparkContext
depth() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel: Depth of the tree.
depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Get depth of tree.
depth() - Method in class org.apache.spark.util.sketch.CountMinSketch: Depth of this CountMinSketch.
DerbyDialect - Class in org.apache.spark.sql.jdbc
DerbyDialect() - Constructor for class org.apache.spark.sql.jdbc.DerbyDialect
deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.CLogLog$
deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Identity$
deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Inverse$
deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Log$
deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Logit$
deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Probit$
deriv(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Sqrt$
derivative() - Method in interface org.apache.spark.ml.ann.ActivationFunction: Implements a derivative of a function (needed for the back propagation)
desc() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
desc() - Method in class org.apache.spark.sql.Column: Returns a sort expression based on the descending order of the column.
desc(String) - Static method in class org.apache.spark.sql.functions: Returns a sort expression based on the descending order of the column.
desc() - Method in class org.apache.spark.util.MethodIdentifier
desc_nulls_first() - Method in class org.apache.spark.sql.Column: Returns a sort expression based on the descending order of the column, and null values appear before non-null values.
desc_nulls_first(String) - Static method in class org.apache.spark.sql.functions: Returns a sort expression based on the descending order of the column, and null values appear before non-null values.
desc_nulls_last() - Method in class org.apache.spark.sql.Column: Returns a sort expression based on the descending order of the column, and null values appear after non-null values.
desc_nulls_last(String) - Static method in class org.apache.spark.sql.functions: Returns a sort expression based on the descending order of the column, and null values appear after non-null values.
describe(String...) - Method in class org.apache.spark.sql.Dataset: Computes basic statistics for numeric and string columns, including count, mean, stddev, min, and max.
describe(Seq<String>) - Method in class org.apache.spark.sql.Dataset: Computes basic statistics for numeric and string columns, including count, mean, stddev, min, and max.
describeTopics(int) - Method in class org.apache.spark.ml.clustering.LDAModel: Return the topics described by their top-weighted terms.
describeTopics() - Method in class org.apache.spark.ml.clustering.LDAModel
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LDAModel: Return the topics described by weighted terms.
describeTopics() - Method in class org.apache.spark.mllib.clustering.LDAModel: Return the topics described by weighted terms.
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
description() - Method in class org.apache.spark.ExceptionFailure
description() - Method in class org.apache.spark.sql.catalog.Column
description() - Method in class org.apache.spark.sql.catalog.Database
description() - Method in class org.apache.spark.sql.catalog.Function
description() - Method in class org.apache.spark.sql.catalog.Table
description() - Method in class org.apache.spark.sql.streaming.SinkProgress
description() - Method in class org.apache.spark.sql.streaming.SourceProgress
description() - Method in class org.apache.spark.status.api.v1.JobData
description() - Method in class org.apache.spark.status.api.v1.StageData
description() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
description() - Method in class org.apache.spark.status.LiveStage
description() - Method in class org.apache.spark.storage.StorageLevel
description() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
DESER_CPU_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
DESER_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
DeserializationStream - Class in org.apache.spark.serializer: :: DeveloperApi :: A stream for reading serialized objects.
DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
deserialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
deserialize(byte[]) - Static method in class org.apache.spark.util.Utils: Deserialize an object using Java serialization
deserialize(byte[], ClassLoader) - Static method in class org.apache.spark.util.Utils: Deserialize an object using Java serialization and the given ClassLoader
deserialized() - Method in class org.apache.spark.storage.StorageLevel
DeserializedMemoryEntry<T> - Class in org.apache.spark.storage.memory
DeserializedMemoryEntry(Object, long, ClassTag<T>) - Constructor for class org.apache.spark.storage.memory.DeserializedMemoryEntry
DeserializedValuesHolder<T> - Class in org.apache.spark.storage.memory: A holder for storing the deserialized values.
DeserializedValuesHolder(ClassTag<T>) - Constructor for class org.apache.spark.storage.memory.DeserializedValuesHolder
deserializeLongValue(byte[]) - Static method in class org.apache.spark.util.Utils: Deserialize a Long value (used for PythonPartitioner)
deserializeOffset(String) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader: Deserialize a JSON string into an Offset of the implementation-defined offset type.
deserializeOffset(String) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader: Deserialize a JSON string into an Offset of the implementation-defined offset type.
DeserializerLock - Class in org.apache.spark.sql.hive: Object to synchronize on when calling org.apache.hadoop.hive.serde2.Deserializer#initialize.
DeserializerLock() - Constructor for class org.apache.spark.sql.hive.DeserializerLock
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.DummySerializerInstance
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
deserializeViaNestedStream(InputStream, SerializerInstance, Function1<DeserializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils: Deserialize via nested stream using specific serializer
destroy() - Method in class org.apache.spark.broadcast.Broadcast: Destroy all data and metadata related to this broadcast variable.
details() - Method in class org.apache.spark.scheduler.StageInfo
details() - Method in class org.apache.spark.status.api.v1.StageData
DETERMINATE() - Static method in class org.apache.spark.rdd.DeterministicLevel
determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner: Determines the bounds for range partitioning from candidates with weights indicating how many items each represents.
DetermineTableStats - Class in org.apache.spark.sql.hive
DetermineTableStats(SparkSession) - Constructor for class org.apache.spark.sql.hive.DetermineTableStats
deterministic() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Returns true iff this function is deterministic, i.e.
deterministic() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction: Returns true iff the UDF is deterministic, i.e.
DeterministicLevel - Class in org.apache.spark.rdd: The deterministic level of RDD's output (i.e.
DeterministicLevel() - Constructor for class org.apache.spark.rdd.DeterministicLevel
deviance(double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
deviance(double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
deviance(double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
deviance(double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
deviance() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: The deviance for the fitted model.
devianceResiduals() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: The weighted residuals, the usual residuals rescaled by the square root of the instance weights.
dfToCols(Dataset<Row>) - Static method in class org.apache.spark.sql.api.r.SQLUtils
dfToRowRDD(Dataset<Row>) - Static method in class org.apache.spark.sql.api.r.SQLUtils
dgemm(double, DenseMatrix<Object>, DenseMatrix<Object>, double, DenseMatrix<Object>) - Static method in class org.apache.spark.ml.ann.BreezeUtil: DGEMM: C := alpha * A * B + beta * C
dgemv(double, DenseMatrix<Object>, DenseVector<Object>, double, DenseVector<Object>) - Static method in class org.apache.spark.ml.ann.BreezeUtil: DGEMV: y := alpha * A * x + beta * y
diag(Vector) - Static method in class org.apache.spark.ml.linalg.DenseMatrix: Generate a diagonal matrix in DenseMatrix format from the supplied values.
diag(Vector) - Static method in class org.apache.spark.ml.linalg.Matrices: Generate a diagonal matrix in Matrix format from the supplied values.
diag(Vector) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a diagonal matrix in DenseMatrix format from the supplied values.
diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a diagonal matrix in Matrix format from the supplied values.
diff(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
diff(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.VertexRDD: For each vertex present in both this and other, diff returns only those vertices with differing values; for values that are different, keeps the values from other.
diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD: For each vertex present in both this and other, diff returns only those vertices with differing values; for values that are different, keeps the values from other.
DifferentiableLossAggregator<Datum,Agg extends DifferentiableLossAggregator<Datum,Agg>> - Interface in org.apache.spark.ml.optim.aggregator: A parent trait for aggregators used in fitting MLlib models.
DifferentiableRegularization<T> - Interface in org.apache.spark.ml.optim.loss: A Breeze diff function which represents a cost function for differentiable regularization of parameters.
dim() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator: The dimension of the gradient array.
dir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
directory(File) - Method in class org.apache.spark.launcher.SparkLauncher: Sets the working directory of spark-submit.
disableOutputSpecValidation() - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils: Allows for the spark.hadoop.validateOutputSpecs checks to be disabled on a case-by-case basis; see SPARK-4835 for more details.
disconnect() - Method in interface org.apache.spark.launcher.SparkAppHandle: Disconnects the handle from the application, without stopping it.
DISK_BYTES_SPILLED() - Static method in class org.apache.spark.InternalAccumulator
DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
DISK_SPILL() - Static method in class org.apache.spark.status.TaskIndexNames
DiskBlockData - Class in org.apache.spark.storage
DiskBlockData(long, long, File, long) - Constructor for class org.apache.spark.storage.DiskBlockData
diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.StageData
diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetrics
diskSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
diskSize() - Method in class org.apache.spark.storage.BlockStatus
diskSize() - Method in class org.apache.spark.storage.BlockUpdatedInfo
diskSize() - Method in class org.apache.spark.storage.RDDInfo
diskUsed() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
diskUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
diskUsed() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
diskUsed() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
diskUsed() - Method in class org.apache.spark.status.LiveExecutor
diskUsed() - Method in class org.apache.spark.status.LiveRDD
diskUsed() - Method in class org.apache.spark.status.LiveRDDDistribution
diskUsed() - Method in class org.apache.spark.status.LiveRDDPartition
dispersion() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: The dispersion of the fitted model.
dispose() - Method in interface org.apache.spark.storage.BlockData
dispose() - Method in class org.apache.spark.storage.DiskBlockData
dispose(ByteBuffer) - Static method in class org.apache.spark.storage.StorageUtils: Attempt to clean up a ByteBuffer if it is direct or memory-mapped.
distanceMeasure() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator: Param for the distance measure to be used in evaluation (supports "squaredEuclidean" (default) and "cosine").
distanceMeasure() - Method in interface org.apache.spark.ml.param.shared.HasDistanceMeasure: Param for the distance measure.
distanceMeasure() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
distanceMeasure() - Method in class org.apache.spark.mllib.clustering.KMeansModel
distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset that contains only the unique rows from this Dataset.
distinct(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Creates a Column for this UDAF using the distinct values of the given Columns as input arguments.
distinct(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Creates a Column for this UDAF using the distinct values of the given Columns as input arguments.
DistributedLDAModel - Class in org.apache.spark.ml.clustering: Distributed model fitted by LDA.
DistributedLDAModel - Class in org.apache.spark.mllib.clustering: Distributed LDA model.
DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed: Represents a distributively stored matrix backed by one or more RDDs.
Distribution - Interface in org.apache.spark.sql.sources.v2.reader.partitioning: An interface to represent data distribution requirement, which specifies how the records should be distributed among the data partitions (one InputPartitionReader outputs data for one partition).
distribution(LiveExecutor) - Method in class org.apache.spark.status.LiveRDD
distributionOpt(LiveExecutor) - Method in class org.apache.spark.status.LiveRDD
div(Decimal, Decimal) - Method in class org.apache.spark.sql.types.Decimal.DecimalIsFractional$
div(Duration) - Method in class org.apache.spark.streaming.Duration
divide(Object) - Method in class org.apache.spark.sql.Column: Divides this expression by another expression.
doc() - Method in class org.apache.spark.ml.param.Param
docConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams: Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
docConcentration() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
docConcentration() - Method in class org.apache.spark.mllib.clustering.LDAModel: Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
docConcentration() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
DocumentFrequencyAggregator(int) - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
DocumentFrequencyAggregator() - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
doesDirectoryContainAnyNewFiles(File, long) - Static method in class org.apache.spark.util.Utils: Determines if a directory contains any files newer than cutoff seconds.
doFetchFile(String, File, String, SparkConf, org.apache.spark.SecurityManager, Configuration) - Static method in class org.apache.spark.util.Utils: Download a file or directory to target directory.
doPostEvent(SparkListenerInterface, SparkListenerEvent) - Method in interface org.apache.spark.scheduler.SparkListenerBus
doPostEvent(L, E) - Method in interface org.apache.spark.util.ListenerBus: Post an event to the specified listener.
Dot - Class in org.apache.spark.ml.feature
Dot() - Constructor for class org.apache.spark.ml.feature.Dot
dot(Vector, Vector) - Static method in class org.apache.spark.ml.linalg.BLAS: dot(x, y)
dot(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS: dot(x, y)
doTest(DStream<Tuple2<StatCounter, StatCounter>>) - Method in interface org.apache.spark.mllib.stat.test.StreamingTestMethod: Perform streaming 2-sample statistical significance testing.
doTest(DStream<Tuple2<StatCounter, StatCounter>>) - Static method in class org.apache.spark.mllib.stat.test.StudentTTest
doTest(DStream<Tuple2<StatCounter, StatCounter>>) - Static method in class org.apache.spark.mllib.stat.test.WelchTTest
DOUBLE() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable double type.
doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use sc().doubleAccumulator(). Since 2.0.0.
doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use sc().doubleAccumulator(String). Since 2.0.0.
doubleAccumulator() - Method in class org.apache.spark.SparkContext: Create and register a double accumulator, which starts with 0 and accumulates inputs by add.
doubleAccumulator(String) - Method in class org.apache.spark.SparkContext: Create and register a double accumulator, which starts with 0 and accumulates inputs by add.
DoubleAccumulator - Class in org.apache.spark.util: An accumulator for computing sum, count, and averages for double precision floating numbers.
DoubleAccumulator() - Constructor for class org.apache.spark.util.DoubleAccumulator
DoubleAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$: Deprecated.
DoubleArrayArrayParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Array[Array[Double]]] for Java.
DoubleArrayArrayParam(Params, String, String, Function1<double[][], Object>) - Constructor for class org.apache.spark.ml.param.DoubleArrayArrayParam
DoubleArrayArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.DoubleArrayArrayParam
DoubleArrayParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Array[Double]] for Java.
DoubleArrayParam(Params, String, String, Function1<double[], Object>) - Constructor for class org.apache.spark.ml.param.DoubleArrayParam
DoubleArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.DoubleArrayParam
DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function: A function that returns zero or more records of type Double from each input record.
DoubleFunction<T> - Interface in org.apache.spark.api.java.function: A function that returns Doubles, and can be used to construct DoubleRDDs.
DoubleParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Double] for Java.
DoubleParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
DoubleParam(String, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
DoubleParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
DoubleParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
DoubleRDDFunctions - Class in org.apache.spark.rdd: Extra functions available on RDDs of Doubles through an implicit conversion.
DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.rdd.RDD
DoubleType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the DoubleType object.
DoubleType - Class in org.apache.spark.sql.types: The data type representing Double values.
DoubleType() - Constructor for class org.apache.spark.sql.types.DoubleType
driver() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SetupDriver
DRIVER_EXTRA_CLASSPATH - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the driver class path.
DRIVER_EXTRA_JAVA_OPTIONS - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the driver VM options.
DRIVER_EXTRA_LIBRARY_PATH - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the driver native library path.
DRIVER_MEMORY - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the driver memory.
DRIVER_WAL_BATCHING_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
DRIVER_WAL_BATCHING_TIMEOUT_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
DRIVER_WAL_CLASS_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
DRIVER_WAL_CLOSE_AFTER_WRITE_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
DRIVER_WAL_MAX_FAILURES_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
DRIVER_WAL_ROLLING_INTERVAL_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
driverLogs() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
drop() - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing any null or NaN values.
drop(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing null or NaN values.
drop(String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
drop(Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
drop(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
drop(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
drop(int) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values.
drop(int, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values in the specified columns.
drop(int, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values in the specified columns.
drop(String...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with columns dropped.
drop(String) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with a column dropped.
drop(Seq<String>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with columns dropped.
drop(Column) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with a column dropped.
dropDatabase(String, boolean, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Drop the specified database, if it exists.
dropDuplicates(String, String...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with duplicate rows removed, considering only the subset of columns.
dropDuplicates() - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset that contains only the unique rows from this Dataset.
dropDuplicates(Seq<String>) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Returns a new Dataset with duplicate rows removed, considering only the subset of columns.
dropDuplicates(String[]) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with duplicate rows removed, considering only the subset of columns.
dropDuplicates(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with duplicate rows removed, considering only the subset of columns.
dropFromMemory(BlockId, Function0<Either<Object, org.apache.spark.util.io.ChunkedByteBuffer>>, ClassTag<T>) - Method in interface org.apache.spark.storage.memory.BlockEvictionHandler: Drop a block from memory, possibly putting it on disk if applicable.
dropFunction(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Drop an existing function in the database.
dropGlobalTempView(String) - Method in class org.apache.spark.sql.catalog.Catalog: Drops the global temporary view with the given view name in the catalog.
dropLast() - Method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated. Whether to drop the last category in the encoded vector (default: true)
dropLast() - Method in interface org.apache.spark.ml.feature.OneHotEncoderBase: Whether to drop the last category in the encoded vector (default: true)
dropPartitions(String, String, Seq<Map<String, String>>, boolean, boolean, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Drop one or many partitions in the given table, assuming they exist.
dropTable(String, String, boolean, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Drop the specified table.
dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
dropTempView(String) - Method in class org.apache.spark.sql.catalog.Catalog: Drops the local temporary view with the given view name in the catalog.
dspmv(int, double, DenseVector, DenseVector, double, DenseVector) - Static method in class org.apache.spark.ml.linalg.BLAS: y := alpha*A*x + beta*y
Dst - Static variable in class org.apache.spark.graphx.TripletFields: Expose the destination and edge fields but not the source field.
dstAttr() - Method in class org.apache.spark.graphx.EdgeContext: The vertex attribute of the edge's destination vertex.
dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet: The destination vertex attribute
dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
dstCol() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams: Name of the input column for destination vertex IDs.
dstId() - Method in class org.apache.spark.graphx.Edge
dstId() - Method in class org.apache.spark.graphx.EdgeContext: The vertex id of the edge's destination vertex.
dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
DStream<T> - Class in org.apache.spark.streaming.dstream: A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data (see org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
dtypes() - Method in class org.apache.spark.sql.Dataset: Returns all column names and their data types as an array.
DummySerializerInstance - Class in org.apache.spark.serializer: Unfortunately, we need a serializer instance in order to construct a DiskBlockObjectWriter.
duration() - Method in class org.apache.spark.scheduler.TaskInfo
duration() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
duration() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
duration() - Method in class org.apache.spark.status.api.v1.TaskData
DURATION() - Static method in class org.apache.spark.status.TaskIndexNames
Duration - Class in org.apache.spark.streaming
Duration(long) - Constructor for class org.apache.spark.streaming.Duration
duration() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo: Return the duration of this output operation.
durationMs() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
Durations - Class in org.apache.spark.streaming
Durations() - Constructor for class org.apache.spark.streaming.Durations
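
The update rule quoted in the dgemv and dspmv entries of this section, y := alpha * A * x + beta * y, can be sketched in plain Java as follows. This is an illustrative sketch only, not Spark's implementation: the class name GemvSketch, the method name gemv, and the row-major double[][] layout are assumptions made for the example, and the Spark methods take Breeze or Spark vector types rather than raw arrays.

```java
// Hypothetical helper illustrating y := alpha * A * x + beta * y.
// Not part of Spark; names and array layout are assumptions for this sketch.
final class GemvSketch {

    // a is row-major: a[i][j] is the element in row i, column j.
    // Updates y in place and returns it for convenience.
    static double[] gemv(double alpha, double[][] a, double[] x, double beta, double[] y) {
        if (a.length != y.length) {
            throw new IllegalArgumentException("row count of A must equal length of y");
        }
        for (int i = 0; i < a.length; i++) {
            if (a[i].length != x.length) {
                throw new IllegalArgumentException("column count of A must equal length of x");
            }
            // Dot product of row i of A with x.
            double dot = 0.0;
            for (int j = 0; j < x.length; j++) {
                dot += a[i][j] * x[j];
            }
            // Scale the product by alpha and the previous y[i] by beta.
            y[i] = alpha * dot + beta * y[i];
        }
        return y;
    }
}
```

For example, with alpha = 2, A = [[1, 2], [3, 4]], x = [1, 1], beta = 0.5 and y = [10, 20], the call returns [11, 24].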

E

Edge<ED> - Class in org.apache.spark.graphx: A single directed edge consisting of a source id, target id, and the data associated with the edge.
Edge(long, long, ED) - Constructor for class org.apache.spark.graphx.Edge
EdgeActiveness - Enum in org.apache.spark.graphx.impl: Criteria for filtering edges based on activeness.
EdgeContext<VD,ED,A> - Class in org.apache.spark.graphx: Represents an edge along with its neighboring vertices and allows sending messages along the edge.
EdgeContext() - Constructor for class org.apache.spark.graphx.EdgeContext
EdgeDirection - Class in org.apache.spark.graphx: The direction of a directed edge relative to a vertex.
EdgeDirection() - Constructor for class org.apache.spark.graphx.EdgeDirection
edgeListFile(SparkContext, String, boolean, int, StorageLevel, StorageLevel) - Static method in class org.apache.spark.graphx.GraphLoader: Loads a graph from an edge list formatted file where each line contains two integers: a source id and a target id.
EdgeOnly - Static variable in class org.apache.spark.graphx.TripletFields: Expose only the edge field and not the source or destination field.
EdgePartition1D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
EdgePartition2D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
EdgeRDD<ED> - Class in org.apache.spark.graphx: EdgeRDD[ED, VD] extends RDD[Edge[ED]] by storing the edges in columnar format on each partition for performance.
EdgeRDD(SparkContext, Seq<Dependency<?>>) - Constructor for class org.apache.spark.graphx.EdgeRDD
EdgeRDDImpl<ED,VD> - Class in org.apache.spark.graphx.impl
edges() - Method in class org.apache.spark.graphx.Graph: An RDD containing the edges and their associated attributes.
edges() - Method in class org.apache.spark.graphx.impl.GraphImpl
EdgeTriplet<VD,ED> - Class in org.apache.spark.graphx: An edge triplet represents an edge along with the vertex attributes of its neighboring vertices.
EdgeTriplet() - Constructor for class org.apache.spark.graphx.EdgeTriplet
EigenValueDecomposition - Class in org.apache.spark.mllib.linalg: Compute eigen-decomposition.
EigenValueDecomposition() - Constructor for class org.apache.spark.mllib.linalg.EigenValueDecomposition
Either() - Static method in class org.apache.spark.graphx.EdgeDirection: Edges originating from *or* arriving at a vertex of interest.
elasticNetParam() - Method in interface org.apache.spark.ml.param.shared.HasElasticNetParam: Param for the ElasticNet mixing parameter, in range [0, 1].
elem(String, Function1<Object, Object>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
elem(Parsers) - Static method in class org.apache.spark.ml.feature.RFormulaParser
element_at(Column, Object) - Static method in class org.apache.spark.sql.functions: Returns the element of the array at the given index in value if the column is an array.
elementType() - Method in class org.apache.spark.sql.types.ArrayType
ElementwiseProduct - Class in org.apache.spark.ml.feature: Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided "weight" vector.
ElementwiseProduct(String) - Constructor for class org.apache.spark.ml.feature.ElementwiseProduct
ElementwiseProduct() - Constructor for class org.apache.spark.ml.feature.ElementwiseProduct
ElementwiseProduct - Class in org.apache.spark.mllib.feature: Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided "weight" vector.
ElementwiseProduct(Vector) - Constructor for class org.apache.spark.mllib.feature.ElementwiseProduct
elems() - Method in class org.apache.spark.status.api.v1.StackTrace
EMLDAOptimizer - Class in org.apache.spark.mllib.clustering: :: DeveloperApi ::
EMLDAOptimizer() - Constructor for class org.apache.spark.mllib.clustering.EMLDAOptimizer
empty() - Static method in class org.apache.spark.api.java.Optional
empty() - Static method in class org.apache.spark.ml.param.ParamMap: Returns an empty param map.
empty() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.Prefix$: An empty Prefix instance.
empty() - Static method in class org.apache.spark.sql.sources.v2.DataSourceOptions
empty() - Static method in class org.apache.spark.sql.types.Metadata: Returns an empty Metadata.
empty() - Static method in class org.apache.spark.storage.BlockStatus
EMPTY_USER_GROUPS() - Static method in class org.apache.spark.util.Utils
emptyDataFrame() - Method in class org.apache.spark.sql.SparkSession: Returns a DataFrame with no rows or columns.
emptyDataFrame() - Method in class org.apache.spark.sql.SQLContext: Returns a DataFrame with no rows or columns.
emptyDataset(Encoder<T>) - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Creates a new Dataset of type T containing zero elements.
emptyNode(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return a node with the given node id (but nothing else set).
emptyRDD() - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD that has no partitions or elements.
emptyRDD(ClassTag<T>) - Method in class org.apache.spark.SparkContext: Get an RDD that has no partitions or elements.
EmptyTaskCommitMessage$() - Constructor for class org.apache.spark.internal.io.FileCommitProtocol.EmptyTaskCommitMessage$
enableBatchRead() - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsScanColumnarBatch: Returns true if the concrete data source reader can read data in batch according to the scan properties like required columns, pushed filters, etc.
enableHiveSupport() - Method in class org.apache.spark.sql.SparkSession.Builder: Enables Hive support, including connectivity to a persistent Hive metastore, support for Hive serdes, and Hive user-defined functions.
enableReceiverLog(SparkConf) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
encode(Column, String) - Static method in class org.apache.spark.sql.functions: Encodes the first argument from a string into a binary using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').
encodeFileNameToURIRawPath(String) - Static method in class org.apache.spark.util.Utils: A file name may contain some invalid URI characters, such as " ".
Encoder<T> - Interface in org.apache.spark.sql: :: Experimental :: Used to convert a JVM object of type T to and from the internal Spark SQL representation.
Encoders - Class in org.apache.spark.sql: :: Experimental :: Methods for creating an Encoder.
Encoders() - Constructor for class org.apache.spark.sql.Encoders
endOffset() - Method in class org.apache.spark.sql.streaming.SourceProgress
endOffset() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
endsWith(Column) - Method in class org.apache.spark.sql.Column: String ends with.
endsWith(String) - Method in class org.apache.spark.sql.Column: String ends with another string literal.
endTime() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
endTime() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
endTime() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
EnsembleCombiningStrategy - Class in org.apache.spark.mllib.tree.configuration: Enum to select ensemble combining strategy for base learners
EnsembleCombiningStrategy() - Constructor for class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
EnsembleModelReadWrite - Class in org.apache.spark.ml.tree
EnsembleModelReadWrite() - Constructor for class org.apache.spark.ml.tree.EnsembleModelReadWrite
EnsembleModelReadWrite.EnsembleNodeData - Class in org.apache.spark.ml.tree: Info for one Node in a tree ensemble
EnsembleModelReadWrite.EnsembleNodeData$ - Class in org.apache.spark.ml.tree
EnsembleNodeData(int, DecisionTreeModelReadWrite.NodeData) - Constructor for class org.apache.spark.ml.tree.EnsembleModelReadWrite.EnsembleNodeData
EnsembleNodeData$() - Constructor for class org.apache.spark.ml.tree.EnsembleModelReadWrite.EnsembleNodeData$
entries() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Entropy - Class in org.apache.spark.mllib.tree.impurity: Class for calculating entropy during multiclass classification.
Entropy() - Constructor for class org.apache.spark.mllib.tree.impurity.Entropy
entrySet() - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
EnumUtil - Class in org.apache.spark.util
EnumUtil() - Constructor for class org.apache.spark.util.EnumUtil
environmentDetails() - Method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
environmentUpdateFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
environmentUpdateToJson(SparkListenerEnvironmentUpdate) - Static method in class org.apache.spark.util.JsonProtocol
EPSILON() - Static method in class org.apache.spark.ml.impl.Utils
epsilon() - Method in interface org.apache.spark.ml.regression.LinearRegressionParams: The shape parameter to control the amount of robustness.
eqNullSafe(Object) - Method in class org.apache.spark.sql.Column: Equality test that is safe for null values.
EqualNullSafe - Class in org.apache.spark.sql.sources: Performs equality comparison, similar to EqualTo.
EqualNullSafe(String, Object) - Constructor for class org.apache.spark.sql.sources.EqualNullSafe
equals(Object) - Method in class org.apache.spark.api.java.Optional
equals(Object) - Static method in class org.apache.spark.ExpireDeadHosts
equals(Object) - Method in class org.apache.spark.graphx.EdgeDirection
equals(Object) - Method in class org.apache.spark.HashPartitioner
equals(Object) - Method in class org.apache.spark.ml.attribute.AttributeGroup
equals(Object) - Method in class org.apache.spark.ml.attribute.BinaryAttribute
equals(Object) - Method in class org.apache.spark.ml.attribute.NominalAttribute
equals(Object) - Method in class org.apache.spark.ml.attribute.NumericAttribute
equals(Object) - Static method in class org.apache.spark.ml.feature.Dot
equals(Object) - Method in class org.apache.spark.ml.linalg.DenseMatrix
equals(Object) - Method in class org.apache.spark.ml.linalg.DenseVector
equals(Object) - Method in class org.apache.spark.ml.linalg.SparseMatrix
equals(Object) - Method in class org.apache.spark.ml.linalg.SparseVector
equals(Object) - Method in interface org.apache.spark.ml.linalg.Vector
equals(Object) - Method in class org.apache.spark.ml.param.Param
equals(Object) - Method in class org.apache.spark.ml.tree.CategoricalSplit
equals(Object) - Method in class org.apache.spark.ml.tree.ContinuousSplit
equals(Object) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
equals(Object) - Method in class org.apache.spark.mllib.linalg.DenseVector
equals(Object) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
equals(Object) - Method in class org.apache.spark.mllib.linalg.SparseVector
equals(Object) - Method in interface org.apache.spark.mllib.linalg.Vector
equals(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
equals(Object) - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
equals(Object) - Method in class org.apache.spark.mllib.tree.model.Predict
equals(Object) - Method in class org.apache.spark.partial.BoundedDouble
equals(Object) - Method in interface org.apache.spark.Partition
equals(Object) - Method in class org.apache.spark.RangePartitioner
equals(Object) - Static method in class org.apache.spark.Resubmitted
equals(Object) - Static method in class org.apache.spark.rpc.netty.OnStart
equals(Object) - Static method in class org.apache.spark.rpc.netty.OnStop
equals(Object) - Static method in class org.apache.spark.scheduler.AllJobsCancelled
equals(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
equals(Object) - Method in class org.apache.spark.scheduler.InputFormatInfo
equals(Object) - Static method in class org.apache.spark.scheduler.JobSucceeded
equals(Object) - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
equals(Object) - Method in class org.apache.spark.scheduler.SplitInfo
equals(Object) - Static method in class org.apache.spark.scheduler.StopCoordinator
equals(Object) - Method in class org.apache.spark.sql.Column
equals(Object) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
equals(Object) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
equals(Object) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
equals(Object) - Method in interface org.apache.spark.sql.Row
equals(Object) - Method in class org.apache.spark.sql.sources.In
equals(Object) - Method in class org.apache.spark.sql.sources.v2.reader.streaming.Offset: Equality based on JSON string representation.
equals(Object) - Static method in class org.apache.spark.sql.types.BinaryType
equals(Object) - Static method in class org.apache.spark.sql.types.BooleanType
equals(Object) - Static method in class org.apache.spark.sql.types.ByteType
equals(Object) - Static method in class org.apache.spark.sql.types.CalendarIntervalType
equals(Object) - Static method in class org.apache.spark.sql.types.DateType
equals(Object) - Method in class org.apache.spark.sql.types.Decimal
equals(Object) - Static method in class org.apache.spark.sql.types.DoubleType
equals(Object) - Static method in class org.apache.spark.sql.types.FloatType
equals(Object) - Static method in class org.apache.spark.sql.types.IntegerType
equals(Object) - Static method in class org.apache.spark.sql.types.LongType
equals(Object) - Method in class org.apache.spark.sql.types.Metadata
equals(Object) - Static method in class org.apache.spark.sql.types.NullType
equals(Object) - Static method in class org.apache.spark.sql.types.ShortType
equals(Object) - Static method in class org.apache.spark.sql.types.StringType
equals(Object) - Method in class org.apache.spark.sql.types.StructType
equals(Object) - Static method in class org.apache.spark.sql.types.TimestampType
equals(Object) - Static method in class org.apache.spark.StopMapOutputTracker
equals(Object) - Method in class org.apache.spark.storage.BlockManagerId
equals(Object) - Method in class org.apache.spark.storage.StorageLevel
equals(Object) - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
equals(Object) - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
equals(Object) - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
equals(Object) - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
equals(Object) - Static method in class org.apache.spark.Success
equals(Object) - Static method in class org.apache.spark.TaskResultLost
equals(Object) - Static method in class org.apache.spark.TaskSchedulerIsSet
equals(Object) - Static method in class org.apache.spark.UnknownReason
equalsStructurally(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataType: Returns true if the two data types share the same "shape".
equalTo(Object) - Method in class org.apache.spark.sql.Column: Equality test.
EqualTo - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a value equal to value.
EqualTo(String, Object) - Constructor for class org.apache.spark.sql.sources.EqualTo
err(String) - Static method in class org.apache.spark.ml.feature.RFormulaParser
ERROR() - Static method in class org.apache.spark.status.TaskIndexNames
ErrorHandlingReadableChannel(ReadableByteChannel, ReadableByteChannel) - Constructor for class org.apache.spark.security.CryptoStreamUtils.ErrorHandlingReadableChannel
errorMessage() - Method in class org.apache.spark.status.api.v1.TaskData
errorMessage() - Method in class org.apache.spark.status.LiveTask
estimate(double[]) - Method in class org.apache.spark.mllib.stat.KernelDensity: Estimates probability density function at the given array of points.
estimate(Object) - Static method in class org.apache.spark.util.SizeEstimator: Estimate the number of bytes that the given object takes up on the JVM heap.
estimateCount(Object) - Method in class org.apache.spark.util.sketch.CountMinSketch: Returns the estimated frequency of item.
estimatedDocConcentration() - Method in class org.apache.spark.ml.clustering.LDAModel: Value for docConcentration estimated from data.
estimatedSize() - Method in class org.apache.spark.storage.memory.DeserializedValuesHolder
estimatedSize() - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
estimatedSize() - Method in interface org.apache.spark.storage.memory.ValuesHolder
estimatedSize() - Method in interface org.apache.spark.util.KnownSizeEstimation
estimateStatistics() - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsReportStatistics: Returns the estimated statistics of this data source.
Estimator<M extends Model<M>> - Class in org.apache.spark.ml: :: DeveloperApi :: Abstract class for estimators that fit models to data.
Estimator() - Constructor for class org.apache.spark.ml.Estimator
estimator() - Method in interface org.apache.spark.ml.tuning.ValidatorParams: param for the estimator to be validated
estimatorParamMaps() - Method in interface org.apache.spark.ml.tuning.ValidatorParams: param for estimator param maps
eval() - Method in interface org.apache.spark.ml.ann.ActivationFunction: Implements a function
eval(DenseMatrix<Object>, DenseMatrix<Object>) - Method in interface org.apache.spark.ml.ann.LayerModel: Evaluates the data (process the data through the layer).
evaluate(Dataset<?>) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: Evaluates the model on a test dataset.
evaluate(Dataset<?>) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
evaluate(Dataset<?>) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
evaluate(Dataset<?>, ParamMap) - Method in class org.apache.spark.ml.evaluation.Evaluator: Evaluates model output and returns a scalar metric.
evaluate(Dataset<?>) - Method in class org.apache.spark.ml.evaluation.Evaluator: Evaluates model output and returns a scalar metric.
evaluate(Dataset<?>) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
evaluate(Dataset<?>) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
evaluate(Dataset<?>) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel: Evaluate the model on the given dataset, returning a summary of the results.
evaluate(Dataset<?>) - Method in class org.apache.spark.ml.regression.LinearRegressionModel: Evaluates the model on a test dataset.
evaluate(Row) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Calculates the final result of this UserDefinedAggregateFunction based on the given aggregation buffer.
evaluateEachIteration(Dataset<?>) - Method in class org.apache.spark.ml.classification.GBTClassificationModel: Method to compute error or loss for every iteration of gradient boosting.
evaluateEachIteration(Dataset<?>, String) - Method in class org.apache.spark.ml.regression.GBTRegressionModel: Method to compute error or loss for every iteration of gradient boosting.
evaluateEachIteration(RDD<LabeledPoint>, DecisionTreeRegressionModel[], double[], Loss, Enumeration.Value) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees: Method to compute error or loss for every iteration of gradient boosting.
evaluateEachIteration(RDD<LabeledPoint>, Loss) - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel: Method to compute error or loss for every iteration of gradient boosting.
Evaluator - Class in org.apache.spark.ml.evaluation: :: DeveloperApi :: Abstract class for evaluators that compute metrics from predictions.
Evaluator() - Constructor for class org.apache.spark.ml.evaluation.Evaluator
evaluator() - Method in interface org.apache.spark.ml.tuning.ValidatorParams: param for the evaluator used to select hyper-parameters that maximize the validated metric
eventRates() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
eventTime() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
EventTimeTimeout() - Static method in class org.apache.spark.sql.streaming.GroupStateTimeout: Timeout based on event-time.
except(Dataset<T>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset containing rows in this Dataset but not in another Dataset.
exceptAll(Dataset<T>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset containing rows in this Dataset but not in another Dataset while preserving the duplicates.
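The difference between except and exceptAll is easiest to see on plain collections. The following is a minimal sketch in plain Java, not Spark code: the class ExceptSketch and its two helpers are hypothetical names, and lists stand in for Datasets. It assumes the SQL semantics of EXCEPT DISTINCT (duplicates collapsed) and EXCEPT ALL (each row on the right cancels one matching row on the left). Spark itself does not guarantee row order; the sketch keeps input order only for readability.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class: models the two difference operations on lists.
final class ExceptSketch {

    // Set-style difference: distinct rows of `left` that never occur in `right`.
    static <T> List<T> except(List<T> left, List<T> right) {
        Set<T> exclude = new HashSet<>(right);
        Set<T> kept = new LinkedHashSet<>();
        for (T row : left) {
            if (!exclude.contains(row)) {
                kept.add(row);
            }
        }
        return new ArrayList<>(kept);
    }

    // Multiset difference: every row in `right` cancels one equal row in `left`,
    // and any remaining duplicates in `left` are preserved.
    static <T> List<T> exceptAll(List<T> left, List<T> right) {
        Map<T, Integer> pending = new HashMap<>();
        for (T row : right) {
            pending.merge(row, 1, Integer::sum);
        }
        List<T> kept = new ArrayList<>();
        for (T row : left) {
            int remaining = pending.getOrDefault(row, 0);
            if (remaining > 0) {
                pending.put(row, remaining - 1); // cancel one occurrence
            } else {
                kept.add(row);
            }
        }
        return kept;
    }
}
```

With left = [a, a, b, c] and right = [a], except drops every a, while exceptAll removes only one of them.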
exception() - Method in class org.apache.spark.ExceptionFailure
exception() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread: Contains the exception thrown while writing the parent iterator to the external process.
exception() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Returns the StreamingQueryException if the query was terminated by an exception.
exception() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryTerminatedEvent
ExceptionFailure - Class in org.apache.spark: :: DeveloperApi :: Task failed due to a runtime exception.
ExceptionFailure(String, String, StackTraceElement[], String, Option<ThrowableSerializationWrapper>, Seq<AccumulableInfo>, Seq<AccumulatorV2<?, ?>>) - Constructor for class org.apache.spark.ExceptionFailure
exceptionFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
exceptionString(Throwable) - Static method in class org.apache.spark.util.Utils: Return a nice string representation of the exception.
exceptionToJson(Exception) - Static method in class org.apache.spark.util.JsonProtocol
EXEC_CPU_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
EXEC_RUN_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
execId() - Method in class org.apache.spark.ExecutorLostFailure
execId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
execId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor
executeAndGetOutput(Seq<String>, File, Map<String, String>, boolean) - Static method in class org.apache.spark.util.Utils: Execute a command and get its output, throwing an exception if it yields a code other than 0.
executeCommand(Seq<String>, File, Map<String, String>, boolean) - Static method in class org.apache.spark.util.Utils: Execute a command and return the process running the command.
executionId() - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
ExecutionListenerManager - Class in org.apache.spark.sql.util: :: Experimental ::
executor() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
EXECUTOR() - Static method in class org.apache.spark.status.TaskIndexNames
EXECUTOR_CORES - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the number of executor CPU cores.
EXECUTOR_CPU_TIME() - Static method in class org.apache.spark.InternalAccumulator
EXECUTOR_DESERIALIZE_CPU_TIME() - Static method in class org.apache.spark.InternalAccumulator
EXECUTOR_DESERIALIZE_TIME() - Static method in class org.apache.spark.InternalAccumulator
EXECUTOR_EXTRA_CLASSPATH - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the executor class path.
EXECUTOR_EXTRA_JAVA_OPTIONS - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the executor VM options.
EXECUTOR_EXTRA_LIBRARY_PATH - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the executor native library path.
EXECUTOR_MEMORY - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the executor memory.
EXECUTOR_RUN_TIME() - Static method in class org.apache.spark.InternalAccumulator
executorAddedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
executorAddedToJson(SparkListenerExecutorAdded) - Static method in class org.apache.spark.util.JsonProtocol
ExecutorAllocationClient - Interface in org.apache.spark: A client that communicates with the cluster manager to request or kill executors.
executorCpuTime() - Method in class org.apache.spark.status.api.v1.StageData
executorCpuTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
executorCpuTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
executorDeserializeCpuTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
executorDeserializeCpuTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
executorDeserializeTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
executorDeserializeTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
executorFailures() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
executorFailures() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
executorHeartbeatReceived(String, Tuple2<Object, Seq<AccumulatorV2<?, ?>>>[], BlockManagerId) - Method in interface org.apache.spark.scheduler.TaskScheduler: Update metrics for in-progress tasks and let the master know that the BlockManager is still alive.
executorHost() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
executorHost() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
executorId() - Method in class org.apache.spark.ExecutorRegistered
executorId() - Method in class org.apache.spark.ExecutorRemoved
executorId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.GetExecutorLossReason
executorId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
executorId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
executorId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
executorId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
executorId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
executorId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
executorId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
executorId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorUnblacklisted
executorId() - Method in class org.apache.spark.scheduler.TaskInfo
executorId() - Method in class org.apache.spark.SparkEnv
executorId() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
executorId() - Method in class org.apache.spark.status.api.v1.TaskData
executorId() - Method in class org.apache.spark.status.LiveExecutor
executorId() - Method in class org.apache.spark.status.LiveRDDDistribution
executorId() - Method in class org.apache.spark.storage.BlockManagerId
executorId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetExecutorEndpointRef
executorId() - Method in class org.apache.spark.storage.BlockManagerMessages.HasCachedBlocks
executorId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
executorId() - Method in class org.apache.spark.ui.storage.ExecutorStreamSummary
executorIds() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors
ExecutorInfo - Class in org.apache.spark.scheduler.cluster: :: DeveloperApi :: Stores information about an executor to pass from the scheduler to SparkListeners.
ExecutorInfo(String, int, Map<String, String>) - Constructor for class org.apache.spark.scheduler.cluster.ExecutorInfo
executorInfo() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
executorInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
executorInfoToJson(ExecutorInfo) - Static method in class org.apache.spark.util.JsonProtocol
ExecutorKilled - Class in org.apache.spark.scheduler
ExecutorKilled() - Constructor for class org.apache.spark.scheduler.ExecutorKilled
executorLogs() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
executorLogs() - Method in class org.apache.spark.status.LiveExecutor
executorLost(String, String, ExecutorLossReason) - Method in interface org.apache.spark.scheduler.Schedulable
executorLost(String, ExecutorLossReason) - Method in interface org.apache.spark.scheduler.TaskScheduler: Process a lost executor
ExecutorLostFailure - Class in org.apache.spark: :: DeveloperApi :: The task failed because the executor that it was running on was lost.
ExecutorLostFailure(String, boolean, Option<String>) - Constructor for class org.apache.spark.ExecutorLostFailure
executorMetricsUpdateFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
executorMetricsUpdateToJson(SparkListenerExecutorMetricsUpdate) - Static method in class org.apache.spark.util.JsonProtocol
executorPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
ExecutorPlugin - Interface in org.apache.spark: A plugin which can be automatically instantiated within each Spark executor.
executorRef() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
ExecutorRegistered - Class in org.apache.spark
ExecutorRegistered(String) - Constructor for class org.apache.spark.ExecutorRegistered
ExecutorRemoved - Class in org.apache.spark
ExecutorRemoved(String) - Constructor for class org.apache.spark.ExecutorRemoved
executorRemovedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
executorRemovedToJson(SparkListenerExecutorRemoved) - Static method in class org.apache.spark.util.JsonProtocol
executorRunTime() - Method in class org.apache.spark.status.api.v1.StageData
executorRunTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
executorRunTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
executors() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
executors() - Method in class org.apache.spark.status.LiveRDDPartition
ExecutorStageSummary - Class in org.apache.spark.status.api.v1
ExecutorStreamSummary - Class in org.apache.spark.ui.storage
ExecutorStreamSummary(Seq<org.apache.spark.status.StreamBlockData>) - Constructor for class org.apache.spark.ui.storage.ExecutorStreamSummary
executorSummaries() - Method in class org.apache.spark.status.LiveStage
ExecutorSummary - Class in org.apache.spark.status.api.v1
executorSummary() - Method in class org.apache.spark.status.api.v1.StageData
executorSummary(String) - Method in class org.apache.spark.status.LiveStage
exists() - Method in interface org.apache.spark.sql.streaming.GroupState: Whether state exists or not.
exists(String) - Static method in class org.apache.spark.sql.types.UDTRegistration: Queries if a given user class is already registered or not.
exists() - Method in class org.apache.spark.streaming.State: Whether the state already exists
exitCausedByApp() - Method in class org.apache.spark.ExecutorLostFailure
exitFn() - Method in interface org.apache.spark.util.CommandLineUtils
exp(Column) - Static method in class org.apache.spark.sql.functions: Computes the exponential of the given value.
exp(String) - Static method in class org.apache.spark.sql.functions: Computes the exponential of the given column.
ExpectationAggregator - Class in org.apache.spark.ml.clustering: ExpectationAggregator computes the partial expectation results.
ExpectationAggregator(int, Broadcast<double[]>, Broadcast<Tuple2<DenseVector, DenseVector>[]>) - Constructor for class org.apache.spark.ml.clustering.ExpectationAggregator
ExpectationSum - Class in org.apache.spark.mllib.clustering
ExpectationSum(double, double[], DenseVector<Object>[], DenseMatrix<Object>[]) - Constructor for class org.apache.spark.mllib.clustering.ExpectationSum
expectedFpp() - Method in class org.apache.spark.util.sketch.BloomFilter: Returns the probability that BloomFilter.mightContain(Object) erroneously returns true for an object that has not actually been put in the BloomFilter.
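One common way to estimate this probability: a lookup misreports an absent item only when every probed bit is already set, so if a fraction f of the bits is set and the k hash positions are treated as independent, the estimate is f^k. The sketch below is plain Java under that assumption. BloomFppSketch and its parameter names are hypothetical, and it is not presented as the library's implementation.

```java
// Hypothetical helper: estimates a Bloom filter's false positive probability
// from its current fill ratio, assuming independent hash positions.
final class BloomFppSketch {

    // setBits: number of bits currently set to 1
    // totalBits: size of the bit array
    // numHashFunctions: number of bit positions probed per lookup
    static double expectedFpp(long setBits, long totalBits, int numHashFunctions) {
        double fillRatio = (double) setBits / totalBits;
        return Math.pow(fillRatio, numHashFunctions);
    }
}
```

For example, a half-full filter probed with three hash functions gives 0.5^3 = 0.125.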
experimental() - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: A collection of methods that are considered experimental, but can be used to hook into the query planner for advanced functionality.
experimental() - Method in class org.apache.spark.sql.SQLContext: :: Experimental :: A collection of methods that are considered experimental, but can be used to hook into the query planner for advanced functionality.
ExperimentalMethods - Class in org.apache.spark.sql: :: Experimental :: Holder for experimental methods for the bravest.
ExpireDeadHosts - Class in org.apache.spark
ExpireDeadHosts() - Constructor for class org.apache.spark.ExpireDeadHosts
expiryTime() - Method in class org.apache.spark.scheduler.BlacklistedExecutor
explain(boolean) - Method in class org.apache.spark.sql.Column: Prints the expression to the console for debugging purposes.
explain(boolean) - Method in class org.apache.spark.sql.Dataset: Prints the plans (logical and physical) to the console for debugging purposes.
explain() - Method in class org.apache.spark.sql.Dataset: Prints the physical plan to the console for debugging purposes.
explain() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Prints the physical plan to the console for debugging purposes.
explain(boolean) - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Prints the physical plan to the console for debugging purposes.
explainedVariance() - Method in class org.apache.spark.ml.feature.PCAModel
explainedVariance() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Returns the explained variance regression score.
explainedVariance() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics: Returns the variance explained by regression.
explainedVariance() - Method in class org.apache.spark.mllib.feature.PCAModel
explainParam(Param<?>) - Method in interface org.apache.spark.ml.param.Params: Explains a param.
explainParams() - Method in interface org.apache.spark.ml.param.Params: Explains all params of this instance.
explode(Seq<Column>, Function1<Row, TraversableOnce<A>>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.Dataset: Deprecated. Use flatMap() or select() with functions.explode() instead. Since 2.0.0.
explode(String, String, Function1<A, TraversableOnce<B>>, TypeTags.TypeTag<B>) - Method in class org.apache.spark.sql.Dataset: Deprecated. Use flatMap() or select() with functions.explode() instead. Since 2.0.0.
explode(Column) - Static method in class org.apache.spark.sql.functions: Creates a new row for each element in the given array or map column.
explode_outer(Column) - Static method in class org.apache.spark.sql.functions: Creates a new row for each element in the given array or map column.
expm1(Column) - Static method in class org.apache.spark.sql.functions: Computes the exponential of the given value minus one.
expm1(String) - Static method in class org.apache.spark.sql.functions: Computes the exponential of the given column minus one.
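A dedicated "exponential minus one" function exists because computing exp(x) - 1 directly loses most of its significant digits when x is close to zero. The sketch below illustrates this with java.lang.Math only. It does not call Spark, and the class Expm1Sketch and its helpers are hypothetical names. The reference value uses the first terms of the Taylor series, which is accurate enough for very small x.

```java
// Hypothetical helper: compares a naive exp(x) - 1 with Math.expm1 near zero.
final class Expm1Sketch {

    // Naive form: exp(x) rounds to a value near 1.0, so the subtraction
    // cancels most of the significant digits for small x.
    static double naive(double x) {
        return Math.exp(x) - 1.0;
    }

    // Reference for tiny x from the Taylor series e^x - 1 = x + x^2/2 + ...
    static double taylorReference(double x) {
        return x + x * x / 2.0;
    }

    static double relativeError(double approx, double exact) {
        return Math.abs(approx - exact) / Math.abs(exact);
    }

    public static void main(String[] args) {
        double x = 1e-10;
        double ref = taylorReference(x);
        System.out.println("naive error: " + relativeError(naive(x), ref));
        System.out.println("expm1 error: " + relativeError(Math.expm1(x), ref));
    }
}
```

For x = 1e-10 the naive form is off by several orders of magnitude more than Math.expm1.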
ExponentialGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the exponential distribution with the given mean.
ExponentialGenerator(double) - Constructor for class org.apache.spark.mllib.random.ExponentialGenerator
exponentialJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.exponentialRDD.
exponentialJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.exponentialJavaRDD with the default seed.
exponentialJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.exponentialJavaRDD with the default number of partitions and the default seed.
exponentialJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.exponentialVectorRDD.
exponentialJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.exponentialJavaVectorRDD with the default seed.
exponentialJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.exponentialJavaVectorRDD with the default number of partitions and the default seed.
exponentialRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the exponential distribution with the input mean.
exponentialVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the exponential distribution with the input mean.
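For intuition about what "i.i.d. samples from the exponential distribution with the input mean" means, the following is a minimal single-machine sketch using inverse transform sampling. It is plain Java, not the RandomRDDs implementation: ExponentialSketch, sample and mean are hypothetical names, and the sampling method is an assumption chosen for illustration.

```java
import java.util.Random;

// Hypothetical helper: draws exponential samples with a given mean
// by inverting the CDF F(x) = 1 - exp(-x / mean).
final class ExponentialSketch {

    static double[] sample(double mean, int n, long seed) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double u = rng.nextDouble();          // uniform in [0, 1)
            out[i] = -mean * Math.log(1.0 - u);   // 1 - u is in (0, 1], so log is finite
        }
        return out;
    }

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }
}
```

With enough samples, the sample mean should settle close to the requested mean.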
expr() - Method in class org.apache.spark.sql.Column
expr(String) - Static method in class org.apache.spark.sql.functions: Parses the expression string into the column that it represents, similar to Dataset.selectExpr(java.lang.String...).
Expression$() - Constructor for class org.apache.spark.sql.types.DecimalType.Expression$
extensionsForCompressionCodecNames() - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
externalBlockStoreSize() - Method in class org.apache.spark.storage.RDDInfo
ExternalClusterManager - Interface in org.apache.spark.scheduler: A cluster manager interface to plug in an external scheduler.
extractDistribution(Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
extractDoubleDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Object>) - Static method in class org.apache.spark.scheduler.StatsReportListener
extractFn() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
extractHostPortFromSparkUrl(String) - Static method in class org.apache.spark.util.Utils: Return a pair of host and port extracted from the sparkUrl.
extractLongDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Object>) - Static method in class org.apache.spark.scheduler.StatsReportListener
extractParamMap(ParamMap) - Method in interface org.apache.spark.ml.param.Params: Extracts the embedded default param values and user-supplied values, and then merges them with extra values from input into a flat param map, where the latter value is used if there exist conflicts, i.e., with ordering: default param values less than user-supplied values less than extra.
extractParamMap() - Method in interface org.apache.spark.ml.param.Params: extractParamMap with no extra values.
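The precedence described for extractParamMap (default values, then user-supplied values, then extra values, with later sources winning) can be sketched with ordinary maps. This is plain Java, not the Params API: ParamMergeSketch, merge, and the string keys are hypothetical stand-ins for Param objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: merges three layers of settings, later layers winning.
final class ParamMergeSketch {

    static Map<String, Object> merge(Map<String, Object> defaults,
                                     Map<String, Object> userSupplied,
                                     Map<String, Object> extra) {
        Map<String, Object> merged = new LinkedHashMap<>(defaults);
        merged.putAll(userSupplied); // user-supplied values override defaults
        merged.putAll(extra);        // extra values override both
        return merged;
    }
}
```

A key set in all three layers takes its value from extra, and a key set only in defaults keeps its default.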
extractWeightedLabeledPoints(Dataset<?>) - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase: Extracts (label, feature, weight) from input dataset.
extraOptimizations() - Method in class org.apache.spark.sql.ExperimentalMethods
extraStrategies() - Method in class org.apache.spark.sql.ExperimentalMethods: Allows extra strategies to be injected into the query planner at runtime.
eye(int) - Static method in class org.apache.spark.ml.linalg.DenseMatrix: Generate an Identity Matrix in DenseMatrix format.
eye(int) - Static method in class org.apache.spark.ml.linalg.Matrices: Generate a dense Identity Matrix in Matrix format.
eye(int) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate an Identity Matrix in DenseMatrix format.
eye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a dense Identity Matrix in Matrix format.
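As a rough picture of what a dense identity matrix looks like in memory, the sketch below fills a flat array in column-major order, the layout assumed here for dense matrices. It is plain Java and does not use the linalg classes. IdentitySketch, eye and get are hypothetical names.

```java
// Hypothetical helper: builds an n x n identity matrix as a flat
// column-major array, where entry (i, j) lives at index i + n * j.
final class IdentitySketch {

    static double[] eye(int n) {
        double[] values = new double[n * n]; // Java zero-initializes the array
        for (int i = 0; i < n; i++) {
            values[i + n * i] = 1.0;         // diagonal entry (i, i)
        }
        return values;
    }

    static double get(double[] values, int numRows, int i, int j) {
        return values[i + numRows * j];
    }
}
```

For n = 2 the backing array is [1.0, 0.0, 0.0, 1.0].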

F

f() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
f1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns document-based f1-measure averaged by the number of documents
f1Measure(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns f1-measure for a given label (category)
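For reference, the per-label f1-measure is the harmonic mean of precision and recall. The sketch below computes it from raw counts in plain Java. F1Sketch and its parameters are hypothetical, and returning 0.0 when precision and recall are both zero is an assumption made here to avoid dividing by zero.

```java
// Hypothetical helper: F1 = 2 * P * R / (P + R) from counts for one label.
final class F1Sketch {

    static double f1(long truePositives, long falsePositives, long falseNegatives) {
        long predicted = truePositives + falsePositives; // items predicted with the label
        long actual = truePositives + falseNegatives;    // items that truly have the label
        double precision = predicted == 0 ? 0.0 : (double) truePositives / predicted;
        double recall = actual == 0 ? 0.0 : (double) truePositives / actual;
        if (precision + recall == 0.0) {
            return 0.0; // assumption: define F1 as 0 when both terms are 0
        }
        return 2.0 * precision * recall / (precision + recall);
    }
}
```

With 8 true positives, 2 false positives and 2 false negatives, precision and recall are both 0.8, so the f1-measure is 0.8 as well.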
factorial(Column) - Static method in class org.apache.spark.sql.functions: Computes the factorial of the given value.
failed() - Method in class org.apache.spark.scheduler.TaskInfo
FAILED() - Static method in class org.apache.spark.TaskState
failedStages() - Method in class org.apache.spark.status.LiveJob
failedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
failedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
failedTasks() - Method in class org.apache.spark.status.LiveExecutor
failedTasks() - Method in class org.apache.spark.status.LiveExecutorStageSummary
failedTasks() - Method in class org.apache.spark.status.LiveJob
failedTasks() - Method in class org.apache.spark.status.LiveStage
failure(String) - Static method in class org.apache.spark.ml.feature.RFormulaParser
failureReason() - Method in class org.apache.spark.scheduler.StageInfo: If the stage failed, the reason why.
failureReason() - Method in class org.apache.spark.status.api.v1.StageData
failureReason() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
failureReason() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
failureReasonCell(String, int, boolean) - Static method in class org.apache.spark.streaming.ui.UIUtils
FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
FAKE_HIVE_VERSION() - Static method in class org.apache.spark.sql.hive.HiveUtils
FalsePositiveRate - Class in org.apache.spark.mllib.evaluation.binary: False positive rate.
FalsePositiveRate() - Constructor for class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
falsePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns false positive rate for a given label (category)
falsePositiveRateByLabel() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns false positive rate for each label (category).
family() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams: Param for the name of family which is a description of the label distribution to be used in the model.
family() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase: Param for the name of family which is a description of the error distribution to be used in the model.
Family$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Family$
FamilyAndLink$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.FamilyAndLink$
fdr() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams: The upper bound of the expected false discovery rate.
fdr() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
feature() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data
feature() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
feature() - Method in class org.apache.spark.mllib.tree.model.Split
FeatureHasher - Class in org.apache.spark.ml.feature: Feature hashing projects a set of categorical or numerical features into a feature vector of specified dimension (typically substantially smaller than that of the original feature space).
FeatureHasher(String) - Constructor for class org.apache.spark.ml.feature.FeatureHasher
FeatureHasher() - Constructor for class org.apache.spark.ml.feature.FeatureHasher
featureImportances() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel: Estimate of the importance of each feature.
featureImportances() - Method in class org.apache.spark.ml.classification.GBTClassificationModel: Estimate of the importance of each feature.
featureImportances() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel: Estimate of the importance of each feature.
featureImportances() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel: Estimate of the importance of each feature.
featureImportances() - Method in class org.apache.spark.ml.regression.GBTRegressionModel: Estimate of the importance of each feature.
featureImportances() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel: Estimate of the importance of each feature.
featureIndex() - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase: Param for the index of the feature if featuresCol is a vector column (default: 0), no effect otherwise.
featureIndex() - Method in class org.apache.spark.ml.tree.CategoricalSplit
featureIndex() - Method in class org.apache.spark.ml.tree.ContinuousSplit
featureIndex() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData
featureIndex() - Method in interface org.apache.spark.ml.tree.Split: Index of feature which this split tests
features() - Method in class org.apache.spark.ml.feature.LabeledPoint
features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
featuresCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Field in "predictions" which gives the features of each instance as a vector.
featuresCol() - Method in class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
featuresCol() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
featuresCol() - Method in interface org.apache.spark.ml.param.shared.HasFeaturesCol: Param for features column name.
featuresCol() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
featureSubsetStrategy() - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams: The number of features to consider for splits at each tree node.
featureSum() - Method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats
FeatureType - Class in org.apache.spark.mllib.tree.configuration: Enum to describe whether a feature is "continuous" or "categorical"
FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
featureType() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
featureType() - Method in class org.apache.spark.mllib.tree.model.Split
FETCH_WAIT_TIME() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
FetchFailed - Class in org.apache.spark: :: DeveloperApi :: Task failed to fetch shuffle data from a remote node.
FetchFailed(BlockManagerId, int, int, int, String) - Constructor for class org.apache.spark.FetchFailed
fetchFile(String, File, SparkConf, org.apache.spark.SecurityManager, Configuration, long, boolean) - Static method in class org.apache.spark.util.Utils: Download a file or directory to target directory.
fetchPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
fetchWaitTime() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
fetchWaitTime() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
field() - Method in class org.apache.spark.storage.BroadcastBlockId
fieldIndex(String) - Method in interface org.apache.spark.sql.Row: Returns the index of a given field name.
fieldIndex(String) - Method in class org.apache.spark.sql.types.StructType: Returns the index of a given field.
fieldNames() - Method in class org.apache.spark.sql.types.StructType: Returns all field names in an array.
fields() - Method in class org.apache.spark.sql.types.StructType
FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
FILE_FORMAT() - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
FileBasedTopologyMapper - Class in org.apache.spark.storage: A simple file based topology mapper.
FileBasedTopologyMapper(SparkConf) - Constructor for class org.apache.spark.storage.FileBasedTopologyMapper
FileCommitProtocol - Class in org.apache.spark.internal.io: An interface to define how a single Spark job commits its outputs.
FileCommitProtocol() - Constructor for class org.apache.spark.internal.io.FileCommitProtocol
FileCommitProtocol.EmptyTaskCommitMessage$ - Class in org.apache.spark.internal.io
FileCommitProtocol.TaskCommitMessage - Class in org.apache.spark.internal.io
fileFormat() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
files() - Method in class org.apache.spark.SparkContext
fileStream(String, Class<K>, Class<V>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Function1<Path, Object>, boolean, Configuration, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fill(long) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
fill(double) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
fill(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null values in string columns with value.
fill(long, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
fill(double, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
fill(long, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
fill(double, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
fill(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null values in specified string columns.
fill(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that replaces null values in specified string columns.
fill(boolean) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null values in boolean columns with value.
fill(boolean, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that replaces null values in specified boolean columns.
fill(boolean, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null values in specified boolean columns.
fill(Map<String, Object>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null values.
fill(Map<String, Object>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that replaces null values.
filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function1<Graph<VD, ED>, Graph<VD2, ED2>>, Function1<EdgeTriplet<VD2, ED2>, Object>, Function2<Object, VD2, Object>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.GraphOps: Filter the graph by computing some values to filter on, and applying the predicates.
filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
filter(Function1<Tuple2<Object, VD>, Object>) - Method in class org.apache.spark.graphx.VertexRDD: Restricts the vertex set to the set of vertices satisfying the given predicate.
filter(Params) - Method in class org.apache.spark.ml.param.ParamMap: Filters this param map for the given parent.
filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Column) - Method in class org.apache.spark.sql.Dataset: Filters rows using the given condition.
filter(String) - Method in class org.apache.spark.sql.Dataset: Filters rows using the given SQL expression.
filter(Function1<T, Object>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Scala-specific) Returns a new Dataset that only contains elements where func returns true.
filter(FilterFunction<T>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Java-specific) Returns a new Dataset that only contains elements where func returns true.
Filter - Class in org.apache.spark.sql.sources: A filter predicate for data sources.
Filter() - Constructor for class org.apache.spark.sql.sources.Filter
filter() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream containing only the elements that satisfy a predicate.
filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream containing only the elements that satisfy a predicate.
filterByRange(K, K) - Method in class org.apache.spark.rdd.OrderedRDDFunctions: Returns an RDD containing only the elements in the inclusive range lower to upper.
FilterFunction<T> - Interface in org.apache.spark.api.java.function: Base interface for a function used in Dataset's filter function.
filterName() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
filterParams() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
finalStorageLevel() - Method in interface org.apache.spark.ml.recommendation.ALSParams: Param for StorageLevel for ALS model factors.
findClass(String) - Method in class org.apache.spark.util.ParentClassLoader
findFrequentSequentialPatterns(Dataset<?>) - Method in class org.apache.spark.ml.fpm.PrefixSpan: :: Experimental :: Finds the complete set of frequent sequential patterns in the input sequences of itemsets.
findListenersByClass(ClassTag<T>) - Method in interface org.apache.spark.util.ListenerBus
findMissingPartitions() - Method in class org.apache.spark.ShuffleStatus: Returns the sequence of partition ids that are missing.
findSynonyms(String, int) - Method in class org.apache.spark.ml.feature.Word2VecModel: Find "num" number of words closest in similarity to the given word, not including the word itself.
findSynonyms(Vector, int) - Method in class org.apache.spark.ml.feature.Word2VecModel: Find "num" number of words whose vector representation is most similar to the supplied vector.
findSynonyms(String, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel: Find synonyms of a word; do not include the word itself in results.
findSynonyms(Vector, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel: Find synonyms of the vector representation of a word, possibly including any words in the model vocabulary whose vector representation is the supplied vector.
findSynonymsArray(Vector, int) - Method in class org.apache.spark.ml.feature.Word2VecModel: Find "num" number of words whose vector representation is most similar to the supplied vector.
findSynonymsArray(String, int) - Method in class org.apache.spark.ml.feature.Word2VecModel: Find "num" number of words closest in similarity to the given word, not including the word itself.
finish(BUF) - Method in class org.apache.spark.sql.expressions.Aggregator: Transform the output of the reduction.
finished() - Method in class org.apache.spark.scheduler.TaskInfo
FINISHED() - Static method in class org.apache.spark.TaskState
finishTime() - Method in class org.apache.spark.scheduler.TaskInfo: The time when the task has completed successfully (including the time to remotely fetch results, if necessary).
first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
first() - Method in class org.apache.spark.api.java.JavaPairRDD
first() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the first element in this RDD.
first() - Method in class org.apache.spark.rdd.RDD: Return the first element in this RDD.
first() - Method in class org.apache.spark.sql.Dataset: Returns the first row.
first(Column, boolean) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the first value in a group.
first(String, boolean) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the first value of a column in a group.
first(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the first value in a group.
first(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the first value of a column in a group.
firstFailureReason() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
firstLaunchTime() - Method in class org.apache.spark.status.LiveStage
firstTaskLaunchedTime() - Method in class org.apache.spark.status.api.v1.StageData
fit(Dataset<?>) - Method in class org.apache.spark.ml.classification.OneVsRest
fit(Dataset<?>) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
fit(Dataset<?>) - Method in class org.apache.spark.ml.clustering.GaussianMixture
fit(Dataset<?>) - Method in class org.apache.spark.ml.clustering.KMeans
fit(Dataset<?>) - Method in class org.apache.spark.ml.clustering.LDA
fit(Dataset<?>, ParamPair<?>, ParamPair<?>...) - Method in class org.apache.spark.ml.Estimator: Fits a single model to the input data with optional parameters.
fit(Dataset<?>, ParamPair<?>, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Estimator: Fits a single model to the input data with optional parameters.
fit(Dataset<?>, ParamMap) - Method in class org.apache.spark.ml.Estimator: Fits a single model to the input data with provided parameter map.
fit(Dataset<?>) - Method in class org.apache.spark.ml.Estimator: Fits a model to the input data.
fit(Dataset<?>, ParamMap[]) - Method in class org.apache.spark.ml.Estimator: Fits multiple models to the input data with multiple sets of parameters.
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.ChiSqSelector
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.CountVectorizer
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.IDF
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.Imputer
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.MinMaxScaler
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.PCA: Computes a PCAModel that contains the principal components of the input vectors.
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.RFormula
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.StandardScaler
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.StringIndexer
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorIndexer
fit(Dataset<?>) - Method in class org.apache.spark.ml.feature.Word2Vec
fit(Dataset<?>) - Method in class org.apache.spark.ml.fpm.FPGrowth
fit(Dataset<?>) - Method in class org.apache.spark.ml.Pipeline: Fits the pipeline to the input dataset with additional parameters.
fit(Dataset<?>) - Method in class org.apache.spark.ml.Predictor
fit(Dataset<?>) - Method in class org.apache.spark.ml.recommendation.ALS
fit(Dataset<?>) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
fit(Dataset<?>) - Method in class org.apache.spark.ml.regression.IsotonicRegression
fit(Dataset<?>) - Method in class org.apache.spark.ml.tuning.CrossValidator
fit(Dataset<?>) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
fit(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.feature.ChiSqSelector: Returns a ChiSquared feature selector.
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF: Computes the inverse document frequency.
fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF: Computes the inverse document frequency.
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.PCA: Computes a PCAModel that contains the principal components of the input vectors.
fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.PCA: Java-friendly version of fit().
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.StandardScaler: Computes the mean and variance and stores as a model to be used for later scaling.
fit(RDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec: Computes the vector representation of each word in vocabulary.
fit(JavaRDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec: Computes the vector representation of each word in vocabulary (Java version).
fitIntercept() - Method in interface org.apache.spark.ml.param.shared.HasFitIntercept: Param for whether to fit an intercept term.
Fixed$() - Constructor for class org.apache.spark.sql.types.DecimalType.Fixed$
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(Function1<T, TraversableOnce>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(Function1<T, TraversableOnce>, Encoder) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Scala-specific) Returns a new Dataset by first applying a function to all elements of this Dataset, and then flattening the results.
flatMap(FlatMapFunction<T, U>, Encoder) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Java-specific) Returns a new Dataset by first applying a function to all elements of this Dataset, and then flattening the results.
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results.
flatMap(Function1<T, TraversableOnce>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results.
FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function: A function that returns zero or more output records from each input record.
FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function: A function that takes two inputs and returns zero or more output records.
flatMapGroups(Function2<K, Iterator<V>, TraversableOnce>, Encoder) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: (Scala-specific) Applies the given function to each group of data.
flatMapGroups(FlatMapGroupsFunction<K, V, U>, Encoder) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: (Java-specific) Applies the given function to each group of data.
FlatMapGroupsFunction<K,V,R> - Interface in org.apache.spark.api.java.function: A function that returns zero or more output records from each grouping key and its values.
flatMapGroupsWithState(OutputMode, GroupStateTimeout, Function3<K, Iterator<V>, GroupState<S>, Iterator>, Encoder<S>, Encoder) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: ::Experimental:: (Scala-specific) Applies the given function to each group of data, while maintaining a user-defined per-group state.
flatMapGroupsWithState(FlatMapGroupsWithStateFunction<K, V, S, U>, OutputMode, Encoder<S>, Encoder, GroupStateTimeout) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: ::Experimental:: (Java-specific) Applies the given function to each group of data, while maintaining a user-defined per-group state.
FlatMapGroupsWithStateFunction<K,V,S,R> - Interface in org.apache.spark.api.java.function: ::Experimental:: Base interface for a map function used in org.apache.spark.sql.KeyValueGroupedDataset.flatMapGroupsWithState(FlatMapGroupsWithStateFunction, org.apache.spark.sql.streaming.OutputMode, org.apache.spark.sql.Encoder, org.apache.spark.sql.Encoder).
flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results.
flatMapValues(Function<V, Iterable>) - Method in class org.apache.spark.api.java.JavaPairRDD: Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function1<V, TraversableOnce>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function<V, Iterable>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying a flatMap function to the value of each key-value pair in this DStream without changing the key.
flatMapValues(Function1<V, TraversableOnce>, ClassTag) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying a flatMap function to the value of each key-value pair in this DStream without changing the key.
flatten(Column) - Static method in class org.apache.spark.sql.functions: Creates a single array from an array of arrays.
FLOAT() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable float type.
FloatAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$: Deprecated.
FloatParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Float] for Java.
FloatParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
FloatParam(String, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
FloatParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
FloatParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
FloatType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the FloatType object.
FloatType - Class in org.apache.spark.sql.types: The data type representing Float values.
FloatType() - Constructor for class org.apache.spark.sql.types.FloatType
floor(Column) - Static method in class org.apache.spark.sql.functions: Computes the floor of the given value.
floor(String) - Static method in class org.apache.spark.sql.functions: Computes the floor of the given column.
floor() - Method in class org.apache.spark.sql.types.Decimal
floor(Duration) - Method in class org.apache.spark.streaming.Time
floor(Duration, Time) - Method in class org.apache.spark.streaming.Time
flush() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
flush() - Method in class org.apache.spark.serializer.SerializationStream
flush() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
fMeasure(double, double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns f-measure for a given label (category).
fMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns f1-measure for a given label (category).
fMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Deprecated. Use accuracy. Since 2.0.0.
fMeasureByLabel(double) - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns f-measure for each label (category).
fMeasureByLabel() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns f1-measure for each label (category).
fMeasureByThreshold() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Returns a dataframe with two fields (threshold, F-Measure) curve with beta = 1.0.
fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, F-Measure) curve.
fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, F-Measure) curve with beta = 1.0.
fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD: Aggregate the elements of each partition, and then the results for all the partitions, using a given associative function and a neutral "zero value".
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
forceIndexLabel() - Method in interface org.apache.spark.ml.feature.RFormulaBase: Force to index label whether it is numeric or string type.
foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Applies a function f to all elements of this RDD.
foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies a function f to all elements of this RDD.
foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.sql.Dataset: Applies a function f to all rows.
foreach(ForeachFunction<T>) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Runs func on each element of this Dataset.
foreach(ForeachWriter<T>) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Sets the output of the streaming query to be processed using the provided writer object.
foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in class org.apache.spark.ml.linalg.DenseMatrix
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.ml.linalg.DenseVector
foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in interface org.apache.spark.ml.linalg.Matrix: Applies a function f to all the active elements of dense and sparse matrix.
foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in class org.apache.spark.ml.linalg.SparseMatrix
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.ml.linalg.SparseVector
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in interface org.apache.spark.ml.linalg.Vector: Applies a function f to all the active elements of dense and sparse vector.
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.DenseVector
foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Matrix: Applies a function f to all the active elements of dense and sparse matrix.
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.SparseVector
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Vector: Applies a function f to all the active elements of dense and sparse vector.
foreachAsync(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: The asynchronous version of the foreach action, which applies a function f to all the elements of this RDD.
foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions: Applies a function f to all elements of this RDD.
foreachBatch(Function2<Dataset<T>, Object, BoxedUnit>) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: :: Experimental ::
foreachBatch(VoidFunction2<Dataset<T>, Long>) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: :: Experimental ::
ForeachFunction<T> - Interface in org.apache.spark.api.java.function: Base interface for a function used in Dataset's foreach function.
foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Applies a function f to each partition of this RDD.
foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies a function f to each partition of this RDD.
foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.sql.Dataset: Applies a function f to each partition of this Dataset.
foreachPartition(ForeachPartitionFunction<T>) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Runs func on each partition of this Dataset.
foreachPartitionAsync(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike: The asynchronous version of the foreachPartition action, which applies a function f to each partition of this RDD.
foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions: Applies a function f to each partition of this RDD.
ForeachPartitionFunction<T> - Interface in org.apache.spark.api.java.function: Base interface for a function used in Dataset's foreachPartition function.
foreachRDD(VoidFunction<R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Apply a function to each RDD in this DStream.
foreachRDD(VoidFunction2<R, Time>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Apply a function to each RDD in this DStream.
foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
ForeachWriter<T> - Class in org.apache.spark.sql: The abstract class for writing custom logic to process data generated by a query.
ForeachWriter() - Constructor for class org.apache.spark.sql.ForeachWriter
format() - Method in class org.apache.spark.ml.clustering.InternalKMeansModelWriter
format() - Method in class org.apache.spark.ml.clustering.PMMLKMeansModelWriter
format() - Method in class org.apache.spark.ml.regression.InternalLinearRegressionModelWriter
format() - Method in class org.apache.spark.ml.regression.PMMLLinearRegressionModelWriter
format(String) - Method in class org.apache.spark.ml.util.GeneralMLWriter: Specifies the format of ML export (e.g.
format() - Method in interface org.apache.spark.ml.util.MLFormatRegister: The string that represents the format that this format provider uses.
format(String) - Method in class org.apache.spark.sql.DataFrameReader: Specifies the input data source format.
format(String) - Method in class org.apache.spark.sql.DataFrameWriter: Specifies the underlying output data source.
format(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Specifies the input data source format.
format(String) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Specifies the underlying output data source.
format_number(Column, int) - Static method in class org.apache.spark.sql.functions: Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places with HALF_EVEN round mode, and returns the result as a string column.
format_string(String, Column...) - Static method in class org.apache.spark.sql.functions: Formats the arguments in printf-style and returns the result as a string column.
format_string(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions: Formats the arguments in printf-style and returns the result as a string column.
formatBatchTime(long, long, boolean, TimeZone) - Static method in class org.apache.spark.streaming.ui.UIUtils: If batchInterval is less than 1 second, format batchTime with milliseconds.
formatDate(Date) - Static method in class org.apache.spark.ui.UIUtils
formatDate(long) - Static method in class org.apache.spark.ui.UIUtils
formatDuration(long) - Static method in class org.apache.spark.ui.UIUtils
formatDurationVerbose(long) - Static method in class org.apache.spark.ui.UIUtils: Generate a verbose human-readable string representing a duration such as "5 second 35 ms"
formatNumber(double) - Static method in class org.apache.spark.ui.UIUtils: Generate a human-readable string representing a number (e.g.
formatVersion() - Method in interface org.apache.spark.mllib.util.Saveable: Current version of model save/load format.
formula() - Method in interface org.apache.spark.ml.feature.RFormulaBase: R formula parameter.
forward(DenseMatrix<Object>, boolean) - Method in interface org.apache.spark.ml.ann.TopologyModel: Forward propagation
FPGrowth - Class in org.apache.spark.ml.fpm: :: Experimental :: A parallel FP-growth algorithm to mine frequent itemsets.
FPGrowth(String) - Constructor for class org.apache.spark.ml.fpm.FPGrowth
FPGrowth() - Constructor for class org.apache.spark.ml.fpm.FPGrowth
FPGrowth - Class in org.apache.spark.mllib.fpm: A parallel FP-growth algorithm to mine frequent itemsets.
FPGrowth() - Constructor for class org.apache.spark.mllib.fpm.FPGrowth: Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same as the input data}.
FPGrowth.FreqItemset<Item> - Class in org.apache.spark.mllib.fpm: Frequent itemset.
FPGrowthModel - Class in org.apache.spark.ml.fpm: :: Experimental :: Model fitted by FPGrowth.
FPGrowthModel<Item> - Class in org.apache.spark.mllib.fpm: Model trained by FPGrowth, which holds frequent itemsets.
FPGrowthModel(RDD<FPGrowth.FreqItemset<Item>>, Map<Item, Object>, ClassTag<Item>) - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel
FPGrowthModel(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel
FPGrowthModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.fpm
FPGrowthParams - Interface in org.apache.spark.ml.fpm: Common params for FPGrowth and FPGrowthModel
fpr() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams: The highest p-value for features to be kept.
fpr() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
freq() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
freq() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
freqItems(String[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Finding frequent items for columns, possibly with false positives.
freqItems(String[]) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Finding frequent items for columns, possibly with false positives.
freqItems(Seq<String>, double) - Method in class org.apache.spark.sql.DataFrameStatFunctions: (Scala-specific) Finding frequent items for columns, possibly with false positives.
freqItems(Seq<String>) - Method in class org.apache.spark.sql.DataFrameStatFunctions: (Scala-specific) Finding frequent items for columns, possibly with false positives.
FreqItemset(Object, long) - Constructor for class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
freqItemsets() - Method in class org.apache.spark.ml.fpm.FPGrowthModel
freqItemsets() - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
FreqSequence(Object[], long) - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
freqSequences() - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel
from_json(Column, StructType, Map<String, String>) - Static method in class org.apache.spark.sql.functions: (Scala-specific) Parses a column containing a JSON string into a StructType with the specified schema.
from_json(Column, DataType, Map<String, String>) - Static method in class org.apache.spark.sql.functions: (Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema.
from_json(Column, StructType, Map<String, String>) - Static method in class org.apache.spark.sql.functions: (Java-specific) Parses a column containing a JSON string into a StructType with the specified schema.
from_json(Column, DataType, Map<String, String>) - Static method in class org.apache.spark.sql.functions: (Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema.
from_json(Column, StructType) - Static method in class org.apache.spark.sql.functions: Parses a column containing a JSON string into a StructType with the specified schema.
from_json(Column, DataType) - Static method in class org.apache.spark.sql.functions: Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema.
from_json(Column, String, Map<String, String>) - Static method in class org.apache.spark.sql.functions: (Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema.
from_json(Column, String, Map<String, String>) - Static method in class org.apache.spark.sql.functions: (Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema.
from_json(Column, Column) - Static method in class org.apache.spark.sql.functions: (Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType of StructTypes with the specified schema.
from_json(Column, Column, Map<String, String>) - Static method in class org.apache.spark.sql.functions: (Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType of StructTypes with the specified schema.
from_unixtime(Column) - Static method in class org.apache.spark.sql.functions: Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the yyyy-MM-dd HH:mm:ss format.
from_unixtime(Column, String) - Static method in class org.apache.spark.sql.functions: Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the given format.
from_utc_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions: Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone.
from_utc_timestamp(Column, Column) - Static method in class org.apache.spark.sql.functions: Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone.
fromCOO(int, int, Iterable<Tuple3<Object, Object, Object>>) - Static method in class org.apache.spark.ml.linalg.SparseMatrix: Generate a SparseMatrix from Coordinate List (COO) format.
fromCOO(int, int, Iterable<Tuple3<Object, Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate a SparseMatrix from Coordinate List (COO) format.
fromDDL(String) - Static method in class org.apache.spark.sql.types.DataType
fromDDL(String) - Static method in class org.apache.spark.sql.types.StructType: Creates StructType for a given DDL-formatted string, which is a comma separated list of field definitions, e.g., a INT, b STRING.
fromDecimal(Object) - Static method in class org.apache.spark.sql.types.Decimal
fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream: Convert a Scala DStream to a Java-friendly JavaDStream.
fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl: Create a graph from EdgePartitions, setting referenced vertices to defaultVertexAttr.
fromEdges(RDD<Edge<ED>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD: Creates an EdgeRDD from a set of edges.
fromEdges(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph: Construct a graph from a collection of edges.
fromEdges(EdgeRDD<?>, int, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD: Constructs a VertexRDD containing all vertices referred to in edges.
fromEdgeTuples(RDD<Tuple2<Object, Object>>, VD, Option<PartitionStrategy>, StorageLevel, StorageLevel, ClassTag<VD>) - Static method in class org.apache.spark.graphx.Graph: Construct a graph from a collection of edges encoded as vertex id pairs.
fromExistingRDDs(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl: Create a graph from a VertexRDD and an EdgeRDD with the same replicated vertex type as the vertices.
fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream: Convert a Scala InputDStream to a Java-friendly JavaInputDStream.
fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream: Convert a Scala InputDStream of pairs to a Java-friendly JavaPairInputDStream.
fromInt(int) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD: Convert a JavaRDD of key-value pairs to JavaPairRDD.
fromJson(String) - Static method in class org.apache.spark.ml.linalg.JsonMatrixConverter: Parses the JSON representation of a Matrix into a Matrix.
fromJson(String) - Static method in class org.apache.spark.ml.linalg.JsonVectorConverter: Parses the JSON representation of a vector into a Vector.
fromJson(String) - Static method in class org.apache.spark.mllib.linalg.Vectors: Parses the JSON representation of a vector into a Vector.
fromJson(String) - Static method in class org.apache.spark.sql.types.DataType
fromJson(String) - Static method in class org.apache.spark.sql.types.Metadata: Creates a Metadata instance from JSON.
fromKinesisInitialPosition(InitialPositionInStream) - Static method in class org.apache.spark.streaming.kinesis.KinesisInitialPositions: Returns an instance of KinesisInitialPosition based on the passed InitialPositionInStream.
fromMetadata(Metadata) - Method in interface org.apache.spark.ml.attribute.AttributeFactory: Creates an Attribute from a Metadata instance.
fromML(DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Convert new linalg type to spark.mllib type.
fromML(DenseVector) - Static method in class org.apache.spark.mllib.linalg.DenseVector: Convert new linalg type to spark.mllib type.
fromML(Matrix) - Static method in class org.apache.spark.mllib.linalg.Matrices: Convert new linalg type to spark.mllib type.
fromML(SparseMatrix) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Convert new linalg type to spark.mllib type.
fromML(SparseVector) - Static method in class org.apache.spark.mllib.linalg.SparseVector: Convert new linalg type to spark.mllib type.
fromML(Vector) - Static method in class org.apache.spark.mllib.linalg.Vectors: Convert new linalg type to spark.mllib type.
fromName(String) - Static method in class org.apache.spark.ml.attribute.AttributeType: Gets the AttributeType object from its name.
fromNullable(T) - Static method in class org.apache.spark.api.java.Optional
fromOld(Node, Map<Object, Object>) - Static method in class org.apache.spark.ml.tree.Node: Create a new Node from the old Node format, recursively creating child nodes as needed.
fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
fromPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.mllib.rdd.MLPairRDDFunctions: Implicit conversion from a pair RDD to MLPairRDDFunctions.
fromParams(GeneralizedLinearRegressionBase) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Family$: Gets the Family object based on param family and variancePower.
fromParams(GeneralizedLinearRegressionBase) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Link$: Gets the Link object based on param family, link and linkPower.
fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RDDFunctions: Implicit conversion from an RDD to RDDFunctions.
fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream: Convert a Scala ReceiverInputDStream of pairs to a Java-friendly JavaPairReceiverInputDStream.
fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream: Convert a Scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
fromStage(Stage, int, Option<Object>, TaskMetrics, Seq<Seq<TaskLocation>>) - Static method in class org.apache.spark.scheduler.StageInfo: Construct a StageInfo from a Stage.
fromString(String) - Static method in enum org.apache.spark.JobExecutionStatus
fromString(String) - Static method in class org.apache.spark.mllib.tree.impurity.Impurities
fromString(String) - Static method in class org.apache.spark.mllib.tree.loss.Losses
fromString(String) - Static method in enum org.apache.spark.status.api.v1.ApplicationStatus
fromString(String) - Static method in enum org.apache.spark.status.api.v1.StageStatus
fromString(String) - Static method in enum org.apache.spark.status.api.v1.streaming.BatchStatus
fromString(String) - Static method in enum org.apache.spark.status.api.v1.TaskSorting
fromString(String) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Return the StorageLevel object with the specified name.
fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.Attribute
fromStructField(StructField) - Method in interface org.apache.spark.ml.attribute.AttributeFactory: Creates an Attribute from a StructField instance.
fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.AttributeGroup: Creates an attribute group from a StructField instance.
fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.BinaryAttribute
fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.NominalAttribute
fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.NumericAttribute
fullOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a full outer join of this and other.
fullOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a full outer join of this and other.
fullOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a full outer join of this and other.
fullOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullStackTrace() - Method in class org.apache.spark.ExceptionFailure
Function<T1,R> - Interface in org.apache.spark.api.java.function: Base interface for functions whose return types do not create special RDDs.
Function - Class in org.apache.spark.sql.catalog: A user-defined function in Spark, as returned by listFunctions method in Catalog.
Function(String, String, String, String, boolean) - Constructor for class org.apache.spark.sql.catalog.Function
function(Function4<Time, KeyType, Option<ValueType>, State<StateType>, Option<MappedType>>) - Static method in class org.apache.spark.streaming.StateSpec: Create a StateSpec for setting all the specifications of the mapWithState operation on a pair DStream.
function(Function3<KeyType, Option<ValueType>, State<StateType>, MappedType>) - Static method in class org.apache.spark.streaming.StateSpec: Create a StateSpec for setting all the specifications of the mapWithState operation on a pair DStream.
function(Function4<Time, KeyType, Optional<ValueType>, State<StateType>, Optional<MappedType>>) - Static method in class org.apache.spark.streaming.StateSpec: Create a StateSpec for setting all the specifications of the mapWithState operation on a JavaPairDStream.
function(Function3<KeyType, Optional<ValueType>, State<StateType>, MappedType>) - Static method in class org.apache.spark.streaming.StateSpec: Create a StateSpec for setting all the specifications of the mapWithState operation on a JavaPairDStream.
Function0<R> - Interface in org.apache.spark.api.java.function: A zero-argument function that returns an R.
Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function: A two-argument function that takes arguments of type T1 and T2 and returns an R.
Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function: A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
Function4<T1,T2,T3,T4,R> - Interface in org.apache.spark.api.java.function: A four-argument function that takes arguments of type T1, T2, T3 and T4 and returns an R.
functionExists(String) - Method in class org.apache.spark.sql.catalog.Catalog: Check if the function with the specified name exists.
functionExists(String, String) - Method in class org.apache.spark.sql.catalog.Catalog: Check if the function with the specified name exists in the specified database.
functionExists(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Return whether a function exists in the specified database.
functions - Class in org.apache.spark.sql: Commonly used functions available for DataFrame operations.
functions() - Constructor for class org.apache.spark.sql.functions
FutureAction<T> - Interface in org.apache.spark: A future for the result of an action to support cancellation.
futureExecutionContext() - Static method in class org.apache.spark.rdd.AsyncRDDActions
fwe() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams: The upper bound of the expected family-wise error rate.
fwe() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
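
Example (illustrative, not part of the generated index): the foldByKey entry in this section requires a neutral "zero value" because it may be folded into the result more than once. The sketch below is a plain-Java model of that behaviour, assuming the zero value is applied once per key per partition. It does not use Spark, and the class FoldByKeySketch and its methods are hypothetical names introduced only for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

/** Plain-Java model of foldByKey semantics (hypothetical helper, not Spark code). */
public class FoldByKeySketch {

    /** Small helper to build an immutable key-value pair. */
    public static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    /**
     * Folds the values of each key inside every partition, starting each
     * per-partition fold from zeroValue, then merges the partial results
     * across partitions with the same function.
     */
    public static <K, V> Map<K, V> foldByKey(
            List<List<Map.Entry<K, V>>> partitions, V zeroValue, BinaryOperator<V> func) {
        Map<K, V> merged = new HashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            Map<K, V> local = new HashMap<>();
            for (Map.Entry<K, V> kv : partition) {
                // The zero value enters once per key per partition.
                V acc = local.containsKey(kv.getKey()) ? local.get(kv.getKey()) : zeroValue;
                local.put(kv.getKey(), func.apply(acc, kv.getValue()));
            }
            for (Map.Entry<K, V> kv : local.entrySet()) {
                merged.merge(kv.getKey(), kv.getValue(), func);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Integer>>> partitions = new ArrayList<>();
        partitions.add(Arrays.asList(entry("a", 1), entry("b", 2)));
        partitions.add(Arrays.asList(entry("a", 3)));

        // Neutral zero (0 for addition): the result does not depend on partitioning.
        Map<String, Integer> sums = foldByKey(partitions, 0, Integer::sum);
        System.out.println("a=" + sums.get("a") + " b=" + sums.get("b")); // a=4 b=2

        // Non-neutral zero (10): it is added once per partition that holds the key.
        Map<String, Integer> skewed = foldByKey(partitions, 10, Integer::sum);
        System.out.println("a=" + skewed.get("a") + " b=" + skewed.get("b")); // a=24 b=12
    }
}
```

With the non-neutral zero, key "a" spans two partitions and so picks up the extra 10 twice, which is why the entry insists that the zero value must not change the result.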

G

gain() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
gain() - Method in class org.apache.spark.ml.tree.InternalNode
gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
Gamma$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
gamma1() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
gamma2() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
gamma6() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
gamma7() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
GammaGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d.
GammaGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.GammaGenerator
gammaJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.gammaRDD.
gammaJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.gammaJavaRDD with the default seed.
gammaJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.gammaJavaRDD with the default number of partitions and the default seed.
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.gammaVectorRDD.
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.gammaJavaVectorRDD with the default seed.
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.gammaJavaVectorRDD with the default number of partitions and the default seed.
gammaRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input shape and scale.
gammaVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the gamma distribution with the input shape and scale.
gapply(RelationalGroupedDataset, byte[], byte[], Object[], StructType) - Static method in class org.apache.spark.sql.api.r.SQLUtils: The helper function for gapply() on R side.
gaps() - Method in class org.apache.spark.ml.feature.RegexTokenizer: Indicates whether regex splits on gaps (true) or matches tokens (false).
GAUGE() - Static method in class org.apache.spark.metrics.sink.StatsdMetricType
Gaussian$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
GaussianMixture - Class in org.apache.spark.ml.clustering: Gaussian Mixture clustering.
GaussianMixture(String) - Constructor for class org.apache.spark.ml.clustering.GaussianMixture
GaussianMixture() - Constructor for class org.apache.spark.ml.clustering.GaussianMixture
GaussianMixture - Class in org.apache.spark.mllib.clustering: This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs).
GaussianMixture() - Constructor for class org.apache.spark.mllib.clustering.GaussianMixture: Constructs a default instance.
GaussianMixtureModel - Class in org.apache.spark.ml.clustering: Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points are drawn from each Gaussian i with probability weights(i).
GaussianMixtureModel - Class in org.apache.spark.mllib.clustering: Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points are drawn from each Gaussian i=1..k with probability w(i); mu(i) and sigma(i) are the respective mean and covariance for each Gaussian distribution i=1..k.
GaussianMixtureModel(double[], MultivariateGaussian[]) - Constructor for class org.apache.spark.mllib.clustering.GaussianMixtureModel
GaussianMixtureParams - Interface in org.apache.spark.ml.clustering: Common params for GaussianMixture and GaussianMixtureModel
GaussianMixtureSummary - Class in org.apache.spark.ml.clustering: :: Experimental :: Summary of GaussianMixture.
gaussians() - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
gaussians() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
gaussiansDF() - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel: Retrieve Gaussian distributions as a DataFrame.
GBTClassificationModel - Class in org.apache.spark.ml.classification: Gradient-Boosted Trees (GBTs) (http://en.wikipedia.org/wiki/Gradient_boosting) model for classification.
GBTClassificationModel(String, DecisionTreeRegressionModel[], double[]) - Constructor for class org.apache.spark.ml.classification.GBTClassificationModel: Construct a GBTClassificationModel
GBTClassifier - Class in org.apache.spark.ml.classification: Gradient-Boosted Trees (GBTs) (http://en.wikipedia.org/wiki/Gradient_boosting) learning algorithm for classification.
GBTClassifier(String) - Constructor for class org.apache.spark.ml.classification.GBTClassifier
GBTClassifier() - Constructor for class org.apache.spark.ml.classification.GBTClassifier
GBTClassifierParams - Interface in org.apache.spark.ml.tree
GBTParams - Interface in org.apache.spark.ml.tree: Parameters for Gradient-Boosted Tree algorithms.
GBTRegressionModel - Class in org.apache.spark.ml.regression: Gradient-Boosted Trees (GBTs) model for regression.
GBTRegressionModel(String, DecisionTreeRegressionModel[], double[]) - Constructor for class org.apache.spark.ml.regression.GBTRegressionModel: Construct a GBTRegressionModel
GBTRegressor - Class in org.apache.spark.ml.regression: Gradient-Boosted Trees (GBTs) learning algorithm for regression.
GBTRegressor(String) - Constructor for class org.apache.spark.ml.regression.GBTRegressor
GBTRegressor() - Constructor for class org.apache.spark.ml.regression.GBTRegressor
GBTRegressorParams - Interface in org.apache.spark.ml.tree
GC_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
GC_TIME() - Static method in class org.apache.spark.ui.ToolTips
gemm(double, Matrix, DenseMatrix, double, DenseMatrix) - Static method in class org.apache.spark.ml.linalg.BLAS: C := alpha * A * B + beta * C
gemm(double, Matrix, DenseMatrix, double, DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.BLAS: C := alpha * A * B + beta * C
gemv(double, Matrix, Vector, double, DenseVector) - Static method in class org.apache.spark.ml.linalg.BLAS: y := alpha * A * x + beta * y
gemv(double, Matrix, Vector, double, DenseVector) - Static method in class org.apache.spark.mllib.linalg.BLAS: y := alpha * A * x + beta * y
GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
GeneralizedLinearModel - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm.
GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
GeneralizedLinearRegression - Class in org.apache.spark.ml.regression: :: Experimental ::
GeneralizedLinearRegression(String) - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression
GeneralizedLinearRegression() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression
GeneralizedLinearRegression.Binomial$ - Class in org.apache.spark.ml.regression: Binomial exponential family distribution.
GeneralizedLinearRegression.CLogLog$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegression.Family$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegression.FamilyAndLink$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegression.Gamma$ - Class in org.apache.spark.ml.regression: Gamma exponential family distribution.
GeneralizedLinearRegression.Gaussian$ - Class in org.apache.spark.ml.regression: Gaussian exponential family distribution.
GeneralizedLinearRegression.Identity$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegression.Inverse$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegression.Link$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegression.Log$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegression.Logit$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegression.Poisson$ - Class in org.apache.spark.ml.regression: Poisson exponential family distribution.
GeneralizedLinearRegression.Probit$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegression.Sqrt$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegression.Tweedie$ - Class in org.apache.spark.ml.regression
GeneralizedLinearRegressionBase - Interface in org.apache.spark.ml.regression: Params for Generalized Linear Regression.
GeneralizedLinearRegressionModel - Class in org.apache.spark.ml.regression: :: Experimental :: Model produced by GeneralizedLinearRegression.
GeneralizedLinearRegressionSummary - Class in org.apache.spark.ml.regression: :: Experimental :: Summary of GeneralizedLinearRegression model and predictions.
GeneralizedLinearRegressionTrainingSummary - Class in org.apache.spark.ml.regression: :: Experimental :: Summary of GeneralizedLinearRegression fitting and model.
GeneralMLWritable - Interface in org.apache.spark.ml.util: Trait for classes that provide GeneralMLWriter.
GeneralMLWriter - Class in org.apache.spark.ml.util: An ML Writer which delegates based on the requested format.
GeneralMLWriter(PipelineStage) - Constructor for class org.apache.spark.ml.util.GeneralMLWriter
generateAssociationRules(double) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel: Generates association rules for the Items in freqItemsets.
generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator: Generate an RDD containing test data for KMeans.
generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator: For compatibility, the generated data without specifying the mean and variance will have zero mean and variance of (1.0/3.0) since the original output range is [-1, 1] with uniform distribution, and the variance of uniform distribution is (b - a)^2 / 12 which will be (1.0/3.0).
generateLinearInput(double, double[], double[], double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
generateLinearInput(double, double[], double[], double[], int, int, double, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator: Return a Java List of synthetic data randomly generated according to a multi collinear model.
generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator: Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso, and unregularized variants.
generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator: Generate an RDD containing test data for LogisticRegression.
generateRandomEdges(int, int, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
generateRolledOverFileSuffix() - Method in interface org.apache.spark.util.logging.RollingPolicy: Get the desired name of the rollover file
geq(Object) - Method in class org.apache.spark.sql.Column: Greater than or equal to an expression.
get(Object) - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
get() - Method in class org.apache.spark.api.java.Optional
get() - Static method in class org.apache.spark.BarrierTaskContext: :: Experimental :: Returns the currently active BarrierTaskContext.
get() - Method in interface org.apache.spark.FutureAction: Blocks and returns the result of this job.
get(String) - Method in interface org.apache.spark.internal.config.ConfigProvider
get(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap: Optionally returns the value associated with a param.
get(Param<T>) - Method in interface org.apache.spark.ml.param.Params: Optionally returns the user-supplied value of a param.
get(String) - Method in class org.apache.spark.SparkConf: Get a parameter; throws a NoSuchElementException if it's not set
get(String, String) - Method in class org.apache.spark.SparkConf: Get a parameter, falling back to a default if not set
get() - Static method in class org.apache.spark.SparkEnv: Returns the SparkEnv.
get(String) - Static method in class org.apache.spark.SparkFiles: Get the absolute path of a file added through SparkContext.addFile().
get(String) - Static method in class org.apache.spark.sql.jdbc.JdbcDialects: Fetch the JdbcDialect class corresponding to a given database url.
get(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i.
get(String) - Method in class org.apache.spark.sql.RuntimeConfig: Returns the value of Spark runtime configuration property for the given key.
get(String, String) - Method in class org.apache.spark.sql.RuntimeConfig: Returns the value of Spark runtime configuration property for the given key.
get(String) - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions: Returns the option value to which the specified key is mapped, case-insensitively.
get() - Method in interface org.apache.spark.sql.sources.v2.reader.InputPartitionReader: Return the current record.
get() - Method in interface org.apache.spark.sql.streaming.GroupState: Get the state value if it exists, or throw NoSuchElementException.
get(UUID) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager: Returns the query if there is an active query with the given id, or null.
get(String) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager: Returns the query if there is an active query with the given id, or null.
get(int, DataType) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
get(int, DataType) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
get() - Method in class org.apache.spark.streaming.State: Get the state if it exists, otherwise it will throw java.util.NoSuchElementException.
get() - Static method in class org.apache.spark.TaskContext: Return the currently active TaskContext.
get(long) - Static method in class org.apache.spark.util.AccumulatorContext: Returns the AccumulatorV2 registered with the given ID, if any.
get_json_object(Column, String) - Static method in class org.apache.spark.sql.functions: Extracts json object from a json string based on json path specified, and returns json string of the extracted json object.
getAcceptanceResults(RDD<Tuple2<K, V>>, boolean, Map<K, Object>, Option<Map<K, Object>>, long) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils: Count the number of items instantly accepted and generate the waitlist for each stratum.
getActive() - Static method in class org.apache.spark.streaming.StreamingContext: :: Experimental ::
getActiveJobIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker: Returns an array containing the ids of all active jobs.
getActiveJobIds() - Method in class org.apache.spark.SparkStatusTracker: Returns an array containing the ids of all active jobs.
getActiveOrCreate(Function0<StreamingContext>) - Static method in class org.apache.spark.streaming.StreamingContext: :: Experimental ::
getActiveOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext: :: Experimental ::
getActiveSession() - Static method in class org.apache.spark.sql.SparkSession: Returns the active SparkSession for the current thread, returned by the builder.
getActiveStageIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker: Returns an array containing the ids of all active stages.
getActiveStageIds() - Method in class org.apache.spark.SparkStatusTracker: Returns an array containing the ids of all active stages.
getAggregationDepth() - Method in interface org.apache.spark.ml.param.shared.HasAggregationDepth
getAlgo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getAll() - Method in class org.apache.spark.SparkConf: Get all parameters as a list of pairs
getAll() - Method in class org.apache.spark.sql.RuntimeConfig: Returns all properties set in this conf.
getAllConfs() - Method in class org.apache.spark.sql.SQLContext: Return all the configuration properties that have been set (i.e.
getAllPools() - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return pools for fair scheduler
getAllPrefLocs(RDD<?>) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer.PartitionLocations
GetAllReceiverInfo - Class in org.apache.spark.streaming.scheduler
GetAllReceiverInfo() - Constructor for class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
getAllWithPrefix(String) - Method in class org.apache.spark.SparkConf: Get all parameters that start with prefix
getAlpha() - Method in interface org.apache.spark.ml.recommendation.ALSParams
getAlpha() - Method in class org.apache.spark.mllib.clustering.LDA: Alias for getDocConcentration
getAnyValAs(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i.
getAppId() - Method in interface org.apache.spark.launcher.SparkAppHandle: Returns the application ID, or null if not yet known.
getAppId() - Method in class org.apache.spark.SparkConf: Returns the Spark application id, valid in the Driver after TaskScheduler registration and from the start in the Executor.
getApplicationInfo(String) - Method in interface org.apache.spark.status.api.v1.UIRoot
getApplicationInfoList() - Method in interface org.apache.spark.status.api.v1.UIRoot
getArray(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getArray(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getArray(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getArray(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the array type value for rowId.
getAs(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i.
getAs(String) - Method in interface org.apache.spark.sql.Row: Returns the value of a given fieldName.
getAssociationRulesFromFP(Dataset<?>, String, String, double, Map<T, Object>, ClassTag<T>) - Static method in class org.apache.spark.ml.fpm.AssociationRules: Computes the association rules with confidence above minConfidence.
getAsymmetricAlpha() - Method in class org.apache.spark.mllib.clustering.LDA: Alias for getAsymmetricDocConcentration
getAsymmetricDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA: Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
getAttr(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Gets an attribute by its name.
getAttr(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Gets an attribute by its index.
getAvroSchema() - Method in class org.apache.spark.SparkConf: Gets all the avro schemas in the configuration used in the generic Avro record serializer
getBatchingTimeout(SparkConf) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils: How long we will wait for the wrappedLog in the BatchedWriteAheadLog to write the records before we fail the write attempt to unblock receivers.
getBernoulliSamplingFunction(RDD<Tuple2<K, V>>, Map<K, Object>, boolean, long) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils: Return the per partition sampling function used for sampling without replacement.
getBeta() - Method in class org.apache.spark.mllib.clustering.LDA: Alias for getTopicConcentration
getBinary() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
getBinary() - Method in class org.apache.spark.ml.feature.HashingTF
getBinary(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getBinary(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getBinary(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getBinary(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the binary type value for rowId.
getBinaryWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getBinaryWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getBlockSize() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams
GetBlockStatus(BlockId, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
GetBlockStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus$
getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf: Get a parameter as a boolean, falling back to a default if not set
getBoolean(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive boolean.
getBoolean(String, boolean) - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions: Returns the boolean value to which the specified key is mapped, or defaultValue if there is no mapping for the key.
getBoolean(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Boolean.
getBoolean(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getBoolean(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getBoolean(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getBoolean(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the boolean type value for rowId.
getBooleanArray(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Boolean array.
getBooleans(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Gets boolean type values from [rowId, rowId + count).
getBooleanWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getBooleanWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getBucketLength() - Method in interface org.apache.spark.ml.feature.BucketedRandomProjectionLSHParams
getBuilder() - Method in class org.apache.spark.storage.memory.DeserializedValuesHolder
getBuilder() - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
getBuilder() - Method in interface org.apache.spark.storage.memory.ValuesHolder: Note: After this method is called, the ValuesHolder is invalid; we can't store data or get the estimated size again.
getByte(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive byte.
getByte(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getByte(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getByte(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getByte(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the byte type value for rowId.
getBytes(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Gets byte type values from [rowId, rowId + count).
getByteWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getByteWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD: The three methods below are helpers for accessing the local map, a property of the SparkEnv of the local process.
getCacheNodeIds() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
getCallSite(Function1<String, Object>) - Static method in class org.apache.spark.util.Utils: When called inside a class in the spark package, returns the name of the user code class (outside the spark package) that called into Spark, as well as which Spark method they called.
getCaseSensitive() - Method in class org.apache.spark.ml.feature.StopWordsRemover
getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Get the custom datatype mapping for the given jdbc meta information.
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
getCategoricalCols() - Method in class org.apache.spark.ml.feature.FeatureHasher
getCategoricalFeatures(StructField) - Static method in class org.apache.spark.ml.util.MetadataUtils: Examine a schema to identify categorical (Binary and Nominal) features.
getCategoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getCensorCol() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams
getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
getCheckpointDir() - Method in class org.apache.spark.SparkContext
getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike: Gets the name of the file to which this RDD was checkpointed
getCheckpointFile() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
getCheckpointFile() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
getCheckpointFile() - Method in class org.apache.spark.rdd.RDD: Gets the name of the directory to which this RDD was checkpointed.
getCheckpointFiles() - Method in class org.apache.spark.graphx.Graph: Gets the name of the files to which this Graph was checkpointed.
getCheckpointFiles() - Method in class org.apache.spark.graphx.impl.GraphImpl
getCheckpointFiles() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel: :: DeveloperApi ::
getCheckpointInterval() - Method in interface org.apache.spark.ml.param.shared.HasCheckpointInterval
getCheckpointInterval() - Method in class org.apache.spark.mllib.clustering.LDA: Period (in iterations) between checkpoints.
getCheckpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getChild(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getClassifier() - Method in interface org.apache.spark.ml.classification.OneVsRestParams
getColdStartStrategy() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams
getCollectSubModels() - Method in interface org.apache.spark.ml.param.shared.HasCollectSubModels
getCombOp() - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils: Returns the function used to combine results returned by seqOp from different partitions.
getComment() - Method in class org.apache.spark.sql.types.StructField: Return the comment of this StructField.
getConf() - Method in class org.apache.spark.api.java.JavaSparkContext: Return a copy of this JavaSparkContext's configuration.
getConf() - Method in interface org.apache.spark.input.Configurable
getConf() - Method in class org.apache.spark.rdd.HadoopRDD
getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
getConf() - Method in class org.apache.spark.SparkContext: Return a copy of this SparkContext's configuration.
getConf(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the configuration for the given key in the current session.
getConf(String) - Method in class org.apache.spark.sql.SQLContext: Return the value of Spark SQL configuration property for the given key.
getConf(String, String) - Method in class org.apache.spark.sql.SQLContext: Return the value of Spark SQL configuration property for the given key.
getConfiguration() - Method in class org.apache.spark.input.PortableDataStream
getConfiguredLocalDirs(SparkConf) - Static method in class org.apache.spark.util.Utils: Return the configured local directories where Spark can write files.
getConnection() - Method in interface org.apache.spark.rdd.JdbcRDD.ConnectionFactory
getContextOrSparkClassLoader() - Static method in class org.apache.spark.util.Utils: Get the Context ClassLoader on this thread or, if not present, the ClassLoader that loaded Spark.
getConvergenceTol() - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Return the largest change in log-likelihood at which convergence is considered to have occurred.
getCorrelationFromName(String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
getCount() - Method in class org.apache.spark.storage.CountingWritableChannel
getCurrentProcessingTimeMs() - Method in interface org.apache.spark.sql.streaming.GroupState: Get the current processing time as milliseconds in epoch time.
getCurrentUserGroups(SparkConf, String) - Static method in class org.apache.spark.util.Utils
getCurrentUserName() - Static method in class org.apache.spark.util.Utils: Returns the current user name.
getCurrentWatermarkMs() - Method in interface org.apache.spark.sql.streaming.GroupState: Get the current event time watermark as milliseconds in epoch time.
getData(Row) - Static method in class org.apache.spark.ml.image.ImageSchema: Gets the image data
getDatabase(String) - Method in class org.apache.spark.sql.catalog.Catalog: Get the database with the specified name.
getDatabase(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the metadata for specified database, throwing an exception if it doesn't exist
getDate(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of date type as java.sql.Date.
getDateWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getDateWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getDecimal(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of decimal type as java.math.BigDecimal.
getDecimal(int, int, int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getDecimal(int, int, int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getDecimal(int, int, int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getDecimal(int, int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the decimal type value for rowId.
getDecimalWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getDecimalWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params: Gets the default value of a parameter.
getDefaultPropertiesFile(Map<String, String>) - Static method in class org.apache.spark.util.Utils: Return the path of the default Spark properties file.
getDefaultSession() - Static method in class org.apache.spark.sql.SparkSession: Returns the default SparkSession that is returned by the builder.
getDegree() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
getDenseSizeInBytes() - Method in interface org.apache.spark.ml.linalg.Matrix: Gets the size of the dense representation of this `Matrix`.
getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
getDeprecatedConfig(String, Map<String, String>) - Static method in class org.apache.spark.SparkConf: Looks for available deprecated keys for the given config option, and return the first value available.
getDistanceMeasure() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
getDistanceMeasure() - Method in interface org.apache.spark.ml.param.shared.HasDistanceMeasure
getDistanceMeasure() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: The distance suite used by the algorithm.
getDistanceMeasure() - Method in class org.apache.spark.mllib.clustering.KMeans: The distance suite used by the algorithm.
getDistributions() - Method in class org.apache.spark.status.LiveRDD
getDocConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams
getDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA: Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
getDouble(String, double) - Method in class org.apache.spark.SparkConf: Get a parameter as a double, falling back to a default if not set
getDouble(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive double.
getDouble(String, double) - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions: Returns the double value to which the specified key is mapped, or defaultValue if there is no mapping for the key.
getDouble(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Double.
getDouble(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getDouble(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getDouble(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getDouble(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the double type value for rowId.
getDoubleArray(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Double array.
getDoubles(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Gets double type values from [rowId, rowId + count).
getDoubleWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getDoubleWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getDriverLogUrls() - Method in interface org.apache.spark.scheduler.SchedulerBackend: Get the URLs for the driver logs.
getDropLast() - Method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
getDropLast() - Method in interface org.apache.spark.ml.feature.OneHotEncoderBase
getDstCol() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
getDynamicAllocationInitialExecutors(SparkConf) - Static method in class org.apache.spark.util.Utils: Return the initial number of executors for dynamic allocation.
getElasticNetParam() - Method in interface org.apache.spark.ml.param.shared.HasElasticNetParam
getEncryptionEnabled(JavaSparkContext) - Static method in class org.apache.spark.api.r.RUtils
getEndOffset() - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader: Return the specified (if explicitly set through setOffsetRange) or inferred end offset for this reader.
getEndTimeEpoch() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
getEpsilon() - Method in interface org.apache.spark.ml.regression.LinearRegressionParams
getEpsilon() - Method in class org.apache.spark.mllib.clustering.KMeans: The distance threshold within which we consider centers to have converged.
getEstimator() - Method in interface org.apache.spark.ml.tuning.ValidatorParams
getEstimatorParamMaps() - Method in interface org.apache.spark.ml.tuning.ValidatorParams
getEvaluator() - Method in interface org.apache.spark.ml.tuning.ValidatorParams
getExecutionContext() - Method in interface org.apache.spark.ml.param.shared.HasParallelism: Create a new execution context with a thread-pool that has a maximum number of threads set to the value of parallelism.
GetExecutorEndpointRef(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetExecutorEndpointRef
GetExecutorEndpointRef$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetExecutorEndpointRef$
getExecutorEnv() - Method in class org.apache.spark.SparkConf: Get all executor environment variables set on this SparkConf
getExecutorIds() - Method in interface org.apache.spark.ExecutorAllocationClient: Get the list of currently active executors
getExecutorInfos() - Method in class org.apache.spark.SparkStatusTracker: Returns information of all known executors, including host, port, cacheSize, numRunningTasks and memory metrics.
GetExecutorLossReason(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.GetExecutorLossReason
GetExecutorLossReason$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.GetExecutorLossReason$
getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext: Return a map from the slave to the max memory available for caching and the remaining memory available for caching.
getExternalScratchDir(URI, Configuration, String) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
getExternalTmpPath(SparkSession, Configuration, Path) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
getExtTmpPathRelTo(Path, Configuration, String) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
getFamily() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
getFamily() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
getFdr() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
getFeatureIndex() - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase
getFeatureIndicesFromNames(StructField, String[]) - Static method in class org.apache.spark.ml.util.MetadataUtils: Takes a Vector column and a list of feature names, and returns the corresponding list of feature indices in the column, in order.
getFeaturesAndLabels(RFormulaModel, Dataset<?>) - Static method in class org.apache.spark.ml.r.RWrapperUtils: Get the feature names and original labels from the schema of DataFrame transformed by RFormulaModel.
getFeaturesCol() - Method in interface org.apache.spark.ml.param.shared.HasFeaturesCol
getFeatureSubsetStrategy() - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams
getField(String) - Method in class org.apache.spark.sql.Column: An expression that gets a field by name in a StructType.
getFileLength(File, SparkConf) - Static method in class org.apache.spark.util.Utils: Return the file length; if the file is compressed, return the uncompressed file length.
getFileReader(String, Option<Configuration>, boolean) - Static method in class org.apache.spark.sql.hive.orc.OrcFileOperator: Retrieves an ORC file reader from a given path.
getFileSegmentLocations(String, long, long, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils: Get the locations of the HDFS blocks containing the given file segment.
getFileSystemForPath(Path, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
getFinalStorageLevel() - Method in interface org.apache.spark.ml.recommendation.ALSParams
getFinalValue() - Method in class org.apache.spark.partial.PartialResult: Blocking method to wait for and return the final value.
getFitIntercept() - Method in interface org.apache.spark.ml.param.shared.HasFitIntercept
getFloat(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive float.
getFloat(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getFloat(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getFloat(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getFloat(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the float type value for rowId.
getFloats(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Gets float type values from [rowId, rowId + count).
getFloatWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getFloatWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getForceIndexLabel() - Method in interface org.apache.spark.ml.feature.RFormulaBase
getFormattedClassName(Object) - Static method in class org.apache.spark.util.Utils: Return the class name of the given object, removing all dollar signs
getFormula() - Method in interface org.apache.spark.ml.feature.RFormulaBase
getFpr() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
getFunction(String) - Method in class org.apache.spark.sql.catalog.Catalog: Get the function with the specified name.
getFunction(String, String) - Method in class org.apache.spark.sql.catalog.Catalog: Get the function with the specified name.
getFunction(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Return an existing function in the database, assuming it exists.
getFunctionOption(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Return an existing function in the database, or None if it doesn't exist.
getFwe() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
getGaps() - Method in class org.apache.spark.ml.feature.RegexTokenizer
getGroups(String) - Method in interface org.apache.spark.security.GroupMappingServiceProvider: Get the groups the user belongs to.
getHadoopFileSystem(URI, Configuration) - Static method in class org.apache.spark.util.Utils: Return a Hadoop FileSystem with the scheme encoded in the given path.
getHadoopFileSystem(String, Configuration) - Static method in class org.apache.spark.util.Utils: Return a Hadoop FileSystem with the scheme encoded in the given path.
getHandleInvalid() - Method in interface org.apache.spark.ml.param.shared.HasHandleInvalid
getHeight(Row) - Static method in class org.apache.spark.ml.image.ImageSchema: Gets the height of the image
getHiveWriteCompression(TableDesc, SQLConf) - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
getImplicitPrefs() - Method in interface org.apache.spark.ml.recommendation.ALSParams
getImpurity() - Method in interface org.apache.spark.ml.tree.TreeClassifierParams
getImpurity() - Method in interface org.apache.spark.ml.tree.TreeRegressorParams
getImpurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getIndices() - Method in class org.apache.spark.ml.feature.VectorSlicer
getInitializationMode() - Method in class org.apache.spark.mllib.clustering.KMeans: The initialization algorithm.
getInitializationSteps() - Method in class org.apache.spark.mllib.clustering.KMeans: Number of steps for the k-means|| initialization mode
getInitialModel() - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Return the user supplied initial GMM, if supplied
getInitialPositionInStream(int) - Method in class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
getInitialTargetExecutorNumber(SparkConf, int) - Static method in class org.apache.spark.scheduler.cluster.SchedulerBackendUtils: Getting the initial target number of executors depends on whether dynamic allocation is enabled.
getInitialWeights() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams
getInitMode() - Method in interface org.apache.spark.ml.clustering.KMeansParams
getInitMode() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
getInitSteps() - Method in interface org.apache.spark.ml.clustering.KMeansParams
getInputCol() - Method in interface org.apache.spark.ml.param.shared.HasInputCol
getInputCols() - Method in interface org.apache.spark.ml.param.shared.HasInputCols
getInputFilePath() - Static method in class org.apache.spark.rdd.InputFileBlockHolder: Returns the holding file name or empty string if it is unknown.
getInputStream(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
getInt(String, int) - Method in class org.apache.spark.SparkConf: Get a parameter as an integer, falling back to a default if not set
getInt(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive int.
getInt(String, int) - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions: Returns the integer value to which the specified key is mapped, or defaultValue if there is no mapping for the key.
getInt(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getInt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getInt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getInt(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the int type value for rowId.
getIntermediateStorageLevel() - Method in interface org.apache.spark.ml.recommendation.ALSParams
getInterval(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getInterval(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getInterval(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the calendar interval type value for rowId.
getInts(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Gets int type values from [rowId, rowId + count).
getIntWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getIntWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getInverse() - Method in class org.apache.spark.ml.feature.DCT
getIsotonic() - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase
getItem(Object) - Method in class org.apache.spark.sql.Column: An expression that gets an item at position ordinal out of an array, or gets a value by key key in a MapType.
getItemCol() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams
getItemsCol() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
getIteratorSize(Iterator<?>) - Static method in class org.apache.spark.util.Utils: Counts the number of elements of an iterator using a while loop rather than calling TraversableOnce.size(), because the latter uses a for loop, which is slightly slower in the current version of Scala.
getIteratorZipWithIndex(Iterator<T>, long) - Static method in class org.apache.spark.util.Utils: Generate a zipWithIndex iterator that avoids the index value overflow problem in Scala's zipWithIndex.
getJavaMap(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of map type as a java.util.Map.
getJavaSparkContext(SparkSession) - Static method in class org.apache.spark.sql.api.r.SQLUtils
getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Retrieve the jdbc / sql type for a given datatype.
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
getJobIdsForGroup(String) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker: Return a list of all known jobs in a particular job group.
getJobIdsForGroup(String) - Method in class org.apache.spark.SparkStatusTracker: Return a list of all known jobs in a particular job group.
getJobInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker: Returns job information, or null if the job info could not be found or was garbage collected.
getJobInfo(int) - Method in class org.apache.spark.SparkStatusTracker: Returns job information, or None if the job info could not be found or was garbage collected.
getK() - Method in interface org.apache.spark.ml.clustering.BisectingKMeansParams
getK() - Method in interface org.apache.spark.ml.clustering.GaussianMixtureParams
getK() - Method in interface org.apache.spark.ml.clustering.KMeansParams
getK() - Method in interface org.apache.spark.ml.clustering.LDAParams
getK() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
getK() - Method in interface org.apache.spark.ml.feature.PCAParams
getK() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Gets the desired number of leaf clusters.
getK() - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Return the number of Gaussians in the mixture model
getK() - Method in class org.apache.spark.mllib.clustering.KMeans: Number of clusters to create (k).
getK() - Method in class org.apache.spark.mllib.clustering.LDA: Number of topics to infer, i.e., the number of soft cluster centers.
getKappa() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Learning rate: exponential decay rate
getKeepLastCheckpoint() - Method in interface org.apache.spark.ml.clustering.LDAParams
getKeepLastCheckpoint() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer: If using checkpointing, this indicates whether to keep the last checkpoint (vs clean up).
getLabelCol() - Method in interface org.apache.spark.ml.param.shared.HasLabelCol
getLabels() - Method in class org.apache.spark.ml.feature.IndexToString
getLambda() - Method in class org.apache.spark.mllib.classification.NaiveBayes: Get the smoothing parameter.
getLastUpdatedEpoch() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
getLayers() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams
getLDAModel(double[]) - Method in interface org.apache.spark.mllib.clustering.LDAOptimizer
getLearningDecay() - Method in interface org.apache.spark.ml.clustering.LDAParams
getLearningOffset() - Method in interface org.apache.spark.ml.clustering.LDAParams
getLearningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
getLeastGroupHash(String) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer: Sorts and gets the least element of the list associated with key in groupHash. The returned PartitionGroup is the least loaded of all groups that represent the machine "key".
getLength() - Static method in class org.apache.spark.rdd.InputFileBlockHolder: Returns the length of the block being read, or -1 if it is unknown.
getLink() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
getLinkPower() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
getLinkPredictionCol() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
getList(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of array type as a java.util.List.
getLocalDir(SparkConf) - Static method in class org.apache.spark.util.Utils: Get the path of a temporary directory.
getLocale() - Method in class org.apache.spark.ml.feature.StopWordsRemover
getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Get a local property set in this thread, or null if it is missing.
getLocalProperty(String) - Method in class org.apache.spark.BarrierTaskContext
getLocalProperty(String) - Method in class org.apache.spark.SparkContext: Get a local property set in this thread, or null if it is missing.
getLocalProperty(String) - Method in class org.apache.spark.TaskContext: Get a local property set upstream in the driver, or null if it is missing.
getLocalUserJarsForShell(SparkConf) - Static method in class org.apache.spark.util.Utils: Return the local jar files which will be added to REPL's classpath.
GetLocations(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocations
GetLocations$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocations$
GetLocationsAndStatus(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsAndStatus
GetLocationsAndStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsAndStatus$
GetLocationsMultipleBlockIds(BlockId[]) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds
GetLocationsMultipleBlockIds$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds$
getLong(String, long) - Method in class org.apache.spark.SparkConf: Get a parameter as a long, falling back to a default if not set
getLong(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive long.
getLong(String, long) - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions: Returns the long value to which the specified key is mapped, or defaultValue if there is no mapping for the key.
getLong(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Long.
getLong(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getLong(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getLong(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getLong(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the long type value for rowId.
getLongArray(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Long array.
getLongs(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Gets long type values from [rowId, rowId + count).
getLongWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getLongWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getLoss() - Method in interface org.apache.spark.ml.param.shared.HasLoss
getLoss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
getLossType() - Method in interface org.apache.spark.ml.tree.GBTClassifierParams
getLossType() - Method in interface org.apache.spark.ml.tree.GBTRegressorParams
getLowerBound(double, long, double) - Static method in class org.apache.spark.util.random.BinomialBounds: Returns a threshold p such that if we conduct n Bernoulli trials with success rate = p, it is very unlikely to have more than fraction * n successes.
getLowerBound(double) - Static method in class org.apache.spark.util.random.PoissonBounds: Returns a lambda such that Pr[X > s] is very small, where X ~ Pois(lambda).
getLowerBoundsOnCoefficients() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
getLowerBoundsOnIntercepts() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
getMap(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of map type as a Scala Map.
getMap(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getMap(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getMap(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getMap(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the map type value for rowId.
GetMatchingBlockIds(Function1<BlockId, Object>, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
GetMatchingBlockIds$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds$
getMax() - Method in interface org.apache.spark.ml.feature.MinMaxScalerParams
getMaxBins() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
getMaxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getMaxCategories() - Method in interface org.apache.spark.ml.feature.VectorIndexerParams
getMaxDepth() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
getMaxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getMaxDF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
getMaxFailures(SparkConf, boolean) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
getMaxIter() - Method in interface org.apache.spark.ml.param.shared.HasMaxIter
getMaxIterations() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Gets the max number of k-means iterations to split clusters.
getMaxIterations() - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Return the maximum number of iterations allowed
getMaxIterations() - Method in class org.apache.spark.mllib.clustering.KMeans: Maximum number of iterations allowed.
getMaxIterations() - Method in class org.apache.spark.mllib.clustering.LDA: Maximum number of iterations allowed.
getMaxLocalProjDBSize() - Method in class org.apache.spark.ml.fpm.PrefixSpan
getMaxLocalProjDBSize() - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Gets the maximum number of items allowed in a projected database before local processing.
getMaxMemoryInMB() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
getMaxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getMaxPatternLength() - Method in class org.apache.spark.ml.fpm.PrefixSpan
getMaxPatternLength() - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Gets the maximal pattern length.
getMaxSentenceLength() - Method in interface org.apache.spark.ml.feature.Word2VecBase
GetMemoryStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetMemoryStatus$
getMessage() - Method in exception org.apache.spark.sql.AnalysisException
getMetadata(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Metadata.
getMetadataArray(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Metadata array.
getMetricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
getMetricName() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
getMetricName() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
getMetricName() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
getMetricsSources(String) - Method in class org.apache.spark.BarrierTaskContext
getMetricsSources(String) - Method in class org.apache.spark.TaskContext: :: DeveloperApi :: Returns all metrics sources with the given name which are associated with the instance which runs the task.
getMin() - Method in interface org.apache.spark.ml.feature.MinMaxScalerParams
getMinConfidence() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
getMinCount() - Method in interface org.apache.spark.ml.feature.Word2VecBase
getMinDF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
getMinDivisibleClusterSize() - Method in interface org.apache.spark.ml.clustering.BisectingKMeansParams
getMinDivisibleClusterSize() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Gets the minimum number of points (if greater than or equal to 1.0) or the minimum proportion of points (if less than 1.0) of a divisible cluster.
getMinDocFreq() - Method in interface org.apache.spark.ml.feature.IDFBase
getMiniBatchFraction() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Mini-batch fraction, which sets the fraction of documents sampled and used in each iteration
getMinInfoGain() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
getMinInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getMinInstancesPerNode() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams
getMinInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getMinSupport() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
getMinSupport() - Method in class org.apache.spark.ml.fpm.PrefixSpan
getMinSupport() - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Get the minimal support.
getMinTF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
getMinTokenLength() - Method in class org.apache.spark.ml.feature.RegexTokenizer
getMissingValue() - Method in interface org.apache.spark.ml.feature.ImputerParams
getMode(Row) - Static method in class org.apache.spark.ml.image.ImageSchema: Gets the OpenCV representation as an int
getModelType() - Method in interface org.apache.spark.ml.classification.NaiveBayesParams
getModelType() - Method in class org.apache.spark.mllib.classification.NaiveBayes: Get the model type.
getN() - Method in class org.apache.spark.ml.feature.NGram
getNames() - Method in class org.apache.spark.ml.feature.VectorSlicer
getNChannels(Row) - Static method in class org.apache.spark.ml.image.ImageSchema: Gets the number of channels in the image
getNode(int, Node) - Static method in class org.apache.spark.mllib.tree.model.Node: Traces down from a root node to get the node with the given node index.
getNonnegative() - Method in interface org.apache.spark.ml.recommendation.ALSParams
getNumBuckets() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase
getNumBucketsArray() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase
getNumClasses(StructField) - Static method in class org.apache.spark.ml.util.MetadataUtils: Examine a schema to identify the number of classes in a label column.
getNumClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getNumFeatures() - Method in class org.apache.spark.ml.feature.FeatureHasher
getNumFeatures() - Method in class org.apache.spark.ml.feature.HashingTF
getNumFeatures() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: The dimension of training features.
getNumFolds() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams
getNumHashTables() - Method in interface org.apache.spark.ml.feature.LSHParams
getNumItemBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams
getNumIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
getNumObjFields() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
getNumPartitions() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the number of partitions in this RDD.
getNumPartitions() - Method in interface org.apache.spark.ml.feature.Word2VecBase
getNumPartitions() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams
getNumPartitions() - Method in class org.apache.spark.rdd.RDD: Returns the number of partitions of this RDD.
getNumTopFeatures() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
getNumTrees() - Method in class org.apache.spark.ml.classification.GBTClassificationModel: Number of trees in ensemble
getNumTrees() - Method in class org.apache.spark.ml.regression.GBTRegressionModel: Number of trees in ensemble
getNumTrees() - Method in interface org.apache.spark.ml.tree.RandomForestParams
getNumUserBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams
getNumValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute: Get the number of values, either from numValues or from values.
getObjectInspector(String, Option<Configuration>) - Static method in class org.apache.spark.sql.hive.orc.OrcFileOperator
getObjFieldValues(Object, Object[]) - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
getOffset() - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousInputPartitionReader: Get the offset of the current record, or the start offset if no records have been read.
getOffsetCol() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
getOldBoostingStrategy(Map<Object, Object>, Enumeration.Value) - Method in interface org.apache.spark.ml.tree.GBTParams: (private[ml]) Create a BoostingStrategy instance to use with the old API.
getOldDocConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams: Get docConcentration used by spark.mllib LDA
getOldImpurity() - Method in interface org.apache.spark.ml.tree.TreeClassifierParams: Convert new impurity to old impurity.
getOldImpurity() - Method in interface org.apache.spark.ml.tree.TreeRegressorParams: Convert new impurity to old impurity.
getOldLossType() - Method in interface org.apache.spark.ml.tree.GBTClassifierParams: (private[ml]) Convert new loss to old loss.
getOldLossType() - Method in interface org.apache.spark.ml.tree.GBTParams: Get old Gradient Boosting Loss type
getOldLossType() - Method in interface org.apache.spark.ml.tree.GBTRegressorParams: (private[ml]) Convert new loss to old loss.
getOldOptimizer() - Method in interface org.apache.spark.ml.clustering.LDAParams
getOldStrategy(Map<Object, Object>, int, Enumeration.Value, Impurity, double) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: (private[ml]) Create a Strategy instance to use with the old API.
getOldStrategy(Map<Object, Object>, int, Enumeration.Value, Impurity) - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams: Create a Strategy instance to use with the old API.
getOldTopicConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams: Get topicConcentration used by spark.mllib LDA
getOptimizeDocConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams
getOptimizeDocConcentration() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Optimize docConcentration: indicates whether docConcentration (Dirichlet parameter for document-topic distribution) will be optimized during training.
getOptimizer() - Method in interface org.apache.spark.ml.clustering.LDAParams
getOptimizer() - Method in class org.apache.spark.mllib.clustering.LDA: :: DeveloperApi ::
getOption(String) - Method in class org.apache.spark.SparkConf: Get a parameter as an Option
getOption(String) - Method in class org.apache.spark.sql.RuntimeConfig: Returns the value of Spark runtime configuration property for the given key.
getOption() - Method in interface org.apache.spark.sql.streaming.GroupState: Get the state value as a scala Option.
getOption() - Method in class org.apache.spark.streaming.State: Get the state as a scala.Option.
getOrCreate(SparkConf) - Static method in class org.apache.spark.SparkContext: This function may be used to get or instantiate a SparkContext and register it as a singleton object.
getOrCreate() - Static method in class org.apache.spark.SparkContext: This function may be used to get or instantiate a SparkContext and register it as a singleton object.
getOrCreate() - Method in class org.apache.spark.sql.SparkSession.Builder: Gets an existing SparkSession or, if there is no existing one, creates a new one based on the options set in this builder.
getOrCreate(SparkContext) - Static method in class org.apache.spark.sql.SQLContext: Deprecated. Use SparkSession.builder instead. Since 2.0.0.
getOrCreate(String, Function0<JavaStreamingContext>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<JavaStreamingContext>, Configuration) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<JavaStreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreateSparkSession(JavaSparkContext, Map<Object, Object>, boolean) - Static method in class org.apache.spark.sql.api.r.SQLUtils
getOrDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params: Gets the value of a param in the embedded param map or its default value.
getOrElse(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap: Returns the value associated with a param or a default value.
getOrigin(Row) - Static method in class org.apache.spark.ml.image.ImageSchema: Gets the origin of the image
getOutputAttrGroupFromData(Dataset<?>, Seq<String>, Seq<String>, boolean) - Static method in class org.apache.spark.ml.feature.OneHotEncoderCommon: This method is called when we want to generate AttributeGroup from actual data for one-hot encoder.
getOutputCol() - Method in interface org.apache.spark.ml.param.shared.HasOutputCol
getOutputCols() - Method in interface org.apache.spark.ml.param.shared.HasOutputCols
getOutputSize(int) - Method in interface org.apache.spark.ml.ann.Layer: Returns the output size given the input size (not counting the stack size).
getOutputStream(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
getP() - Method in class org.apache.spark.ml.feature.Normalizer
getParallelism() - Method in interface org.apache.spark.ml.param.shared.HasParallelism
getParam(String) - Method in interface org.apache.spark.ml.param.Params: Gets a param by its name.
getParents(int) - Method in class org.apache.spark.NarrowDependency: Get the parent partitions for a child partition.
getParents(int) - Method in class org.apache.spark.OneToOneDependency
getParents(int) - Method in class org.apache.spark.RangeDependency
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
getPartition(long, long, int) - Method in interface org.apache.spark.graphx.PartitionStrategy: Returns the partition number for a given edge.
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
getPartition(Object) - Method in class org.apache.spark.HashPartitioner
getPartition(Object) - Method in class org.apache.spark.Partitioner
getPartition(Object) - Method in class org.apache.spark.RangePartitioner
getPartition(String, String, Map<String, String>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the specified partition, or throws `NoSuchPartitionException`.
getPartitionId() - Static method in class org.apache.spark.TaskContext: Returns the partition id of currently active TaskContext.
getPartitionNames(CatalogTable, Option<Map<String, String>>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the partition names for the given table that match the supplied partition spec.
getPartitionOption(String, String, Map<String, String>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the specified partition or None if it does not exist.
getPartitionOption(CatalogTable, Map<String, String>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the specified partition or None if it does not exist.
getPartitions() - Method in class org.apache.spark.api.r.BaseRRDD
getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
getPartitions() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
getPartitions(String, String, Option<Map<String, String>>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the partitions for the given table that match the supplied partition spec.
getPartitions(CatalogTable, Option<Map<String, String>>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the partitions for the given table that match the supplied partition spec.
getPartitions() - Method in class org.apache.spark.status.LiveRDD
getPartitionsByFilter(CatalogTable, Seq<Expression>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns partitions filtered by predicates for the given table.
getPath() - Method in class org.apache.spark.input.PortableDataStream
getPattern() - Method in class org.apache.spark.ml.feature.RegexTokenizer
GetPeers(BlockManagerId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetPeers
GetPeers$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetPeers$
getPercentile() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
getPersistentRDDs() - Method in class org.apache.spark.api.java.JavaSparkContext: Returns a Java map of JavaRDDs that have marked themselves as persistent via a cache() call.
getPersistentRDDs() - Method in class org.apache.spark.SparkContext: Returns an immutable map of RDDs that have marked themselves as persistent via a cache() call.
getPmml() - Method in interface org.apache.spark.mllib.pmml.export.PMMLModelExport
getPoissonSamplingFunction(RDD<Tuple2<K, V>>, Map<K, Object>, boolean, long, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils: Return the per partition sampling function used for sampling with replacement.
getPoolForName(String) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return the pool associated with the given name, if one exists
getPosition() - Method in class org.apache.spark.streaming.kinesis.KinesisInitialPositions.AtTimestamp
getPosition() - Method in class org.apache.spark.streaming.kinesis.KinesisInitialPositions.Latest
getPosition() - Method in class org.apache.spark.streaming.kinesis.KinesisInitialPositions.TrimHorizon
getPredictionCol() - Method in interface org.apache.spark.ml.param.shared.HasPredictionCol
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
getPrimitiveNullWritableConstantObjectInspector() - Method in interface org.apache.spark.sql.hive.HiveInspectors
getProbabilityCol() - Method in interface org.apache.spark.ml.param.shared.HasProbabilityCol
getProcessName() - Static method in class org.apache.spark.util.Utils: Returns the name of this JVM process.
getPropertiesFromFile(String) - Static method in class org.apache.spark.util.Utils: Load properties present in the given file.
getQuantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getQuantileProbabilities() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams
getQuantilesCol() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams
getRandomSample(Seq<T>, int, Random) - Static method in class org.apache.spark.storage.BlockReplicationUtils: Get a random sample of size m from the elems
getRank() - Method in interface org.apache.spark.ml.recommendation.ALSParams
getRatingCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams
getRawPredictionCol() - Method in interface org.apache.spark.ml.param.shared.HasRawPredictionCol
getRDDStorageInfo() - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return information about which RDDs are cached, whether they are in memory or on disk, how much space they take, etc.
getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream: Gets the receiver object that will be sent to the worker nodes to receive data.
getRegParam() - Method in interface org.apache.spark.ml.param.shared.HasRegParam
getRelativeError() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase
getResource(String) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
getResources(String) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
getRollingIntervalSecs(SparkConf, boolean) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
getRootDirectory() - Static method in class org.apache.spark.SparkFiles: Get the root directory that contains files added through SparkContext.addFile().
getRow(int) - Method in class org.apache.spark.sql.vectorized.ColumnarBatch: Returns the row in this batch at `rowId`.
getRuns() - Method in class org.apache.spark.mllib.clustering.KMeans: Deprecated. This has no effect and always returns 1. Since 2.1.0.
getScalingVec() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
getSchedulableByName(String) - Method in interface org.apache.spark.scheduler.Schedulable
getSchedulingMode() - Method in class org.apache.spark.SparkContext: Return current scheduling mode
getSchemaQuery(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
getSchemaQuery(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: The SQL query that should be used to discover the schema of a table.
getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
getSchemaQuery(String) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
getSeed() - Method in interface org.apache.spark.ml.param.shared.HasSeed
getSeed() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Gets the random seed.
getSeed() - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Return the random seed
getSeed() - Method in class org.apache.spark.mllib.clustering.KMeans: The random seed for cluster initialization.
getSeed() - Method in class org.apache.spark.mllib.clustering.LDA: Random seed for cluster initialization.
getSeed() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Random seed for cluster initialization.
getSelectorType() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams
getSeq(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of array type as a Scala Seq.
getSeqOp(boolean, Map<K, Object>, org.apache.spark.util.random.StratifiedSamplingUtils.RandomDataGenerator, Option<Map<K, Object>>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils: Returns the function used by aggregate to collect sampling statistics for each partition.
getSequenceCol() - Method in class org.apache.spark.ml.fpm.PrefixSpan
getSessionConf(SparkSession) - Static method in class org.apache.spark.sql.api.r.SQLUtils
getShort(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive short.
getShort(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getShort(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getShort(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getShort(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the short type value for rowId.
getShorts(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Gets short type values from [rowId, rowId + count).
getShortWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getShortWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getSimpleMessage() - Method in exception org.apache.spark.sql.AnalysisException
getSimpleName(Class<?>) - Static method in class org.apache.spark.util.Utils: Safer than Class obj's getSimpleName, which may throw a "Malformed class name" error in Scala.
getSize() - Method in class org.apache.spark.ml.feature.VectorSizeHint
getSizeAsBytes(String) - Method in class org.apache.spark.SparkConf: Get a size parameter as bytes; throws a NoSuchElementException if it's not set.
getSizeAsBytes(String, String) - Method in class org.apache.spark.SparkConf: Get a size parameter as bytes, falling back to a default if not set.
getSizeAsBytes(String, long) - Method in class org.apache.spark.SparkConf: Get a size parameter as bytes, falling back to a default if not set.
getSizeAsGb(String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Gibibytes; throws a NoSuchElementException if it's not set.
getSizeAsGb(String, String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Gibibytes, falling back to a default if not set.
getSizeAsKb(String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Kibibytes; throws a NoSuchElementException if it's not set.
getSizeAsKb(String, String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Kibibytes, falling back to a default if not set.
getSizeAsMb(String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Mebibytes; throws a NoSuchElementException if it's not set.
getSizeAsMb(String, String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Mebibytes, falling back to a default if not set.
getSizeForBlock(int) - Method in interface org.apache.spark.scheduler.MapStatus: Estimated size for the reduce block, in bytes.
getSizeInBytes() - Method in interface org.apache.spark.ml.linalg.Matrix: Gets the current size in bytes of this `Matrix`.
getSlotDescs() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
getSmoothing() - Method in interface org.apache.spark.ml.classification.NaiveBayesParams
getSolver() - Method in interface org.apache.spark.ml.param.shared.HasSolver
getSortedTaskSetQueue() - Method in interface org.apache.spark.scheduler.Schedulable
getSparkClassLoader() - Static method in class org.apache.spark.util.Utils: Get the ClassLoader which loaded Spark.
getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext: Get Spark's home location from either a value set through the constructor, or the spark.home Java property, or the SPARK_HOME environment variable (in that order of preference).
getSparkOrYarnConfig(SparkConf, String, String) - Static method in class org.apache.spark.util.Utils: Return the value of a config either through the SparkConf or the Hadoop configuration.
getSparseSizeInBytes(boolean) - Method in interface org.apache.spark.ml.linalg.Matrix: Gets the size of the minimal sparse representation of this `Matrix`.
getSplit() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData
getSplits() - Method in class org.apache.spark.ml.feature.Bucketizer
getSplitsArray() - Method in class org.apache.spark.ml.feature.Bucketizer
getSrcCol() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams
getStageInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker: Returns stage information, or null if the stage info could not be found or was garbage collected.
getStageInfo(int) - Method in class org.apache.spark.SparkStatusTracker: Returns stage information, or None if the stage info could not be found or was garbage collected.
getStagePath(String, int, int, String) - Method in class org.apache.spark.ml.Pipeline.SharedReadWrite$: Get path for saving the given stage.
getStages() - Method in class org.apache.spark.ml.Pipeline
getStagingDir(Path, Configuration, String) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
getStandardization() - Method in interface org.apache.spark.ml.param.shared.HasStandardization
getStartOffset() - Static method in class org.apache.spark.rdd.InputFileBlockHolder: Returns the starting offset of the block currently being read, or -1 if it is unknown.
getStartOffset() - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader: Return the specified or inferred start offset for this reader.
getStartOffset() - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader: Returns the specified (if explicitly set through setOffsetRange) or inferred start offset for this reader.
getStartTimeEpoch() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
getState() - Method in interface org.apache.spark.launcher.SparkAppHandle: Returns the current application state.
getState() - Method in interface org.apache.spark.sql.hive.client.HiveClient: Return the associated Hive SessionState of this HiveClientImpl
getState() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: :: DeveloperApi ::
getState() - Method in class org.apache.spark.streaming.StreamingContext: :: DeveloperApi ::
getStatement() - Method in class org.apache.spark.ml.feature.SQLTransformer
getStderr(Process, long) - Static method in class org.apache.spark.util.Utils: Return the stderr of a process after waiting for the process to terminate.
getStepSize() - Method in interface org.apache.spark.ml.param.shared.HasStepSize
getStopWords() - Method in class org.apache.spark.ml.feature.StopWordsRemover
getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike: Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
getStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
getStorageLevel() - Method in class org.apache.spark.rdd.RDD: Get the RDD's current storage level, or StorageLevel.NONE if none is set.
GetStorageStatus$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.GetStorageStatus$
getStrategy() - Method in interface org.apache.spark.ml.feature.ImputerParams
getString(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a String object.
getString(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a String.
getStringArray(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a String array.
getStringIndexerOrderType() - Method in interface org.apache.spark.ml.feature.RFormulaBase
getStringOrderType() - Method in interface org.apache.spark.ml.feature.StringIndexerBase
getStringWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getStringWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getStruct(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of struct type as a Row object.
getStruct(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getStruct(int, int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getStruct(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the struct type value for rowId.
getSubsamplingRate() - Method in interface org.apache.spark.ml.clustering.LDAParams
getSubsamplingRate() - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams
getSubsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getSystemProperties() - Static method in class org.apache.spark.util.Utils: Returns the system properties map that is thread-safe to iterate over.
getTable(String) - Method in class org.apache.spark.sql.catalog.Catalog: Get the table or view with the specified name.
getTable(String, String) - Method in class org.apache.spark.sql.catalog.Catalog: Get the table or view with the specified name in the specified database.
getTable(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the specified table, or throws `NoSuchTableException`.
getTableExistsQuery(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
getTableExistsQuery(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Get the SQL query that should be used to find if the given table exists.
getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
getTableNames(SparkSession, String) - Static method in class org.apache.spark.sql.api.r.SQLUtils
getTableOption(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the metadata for the specified table or None if it doesn't exist.
getTables(SparkSession, String) - Static method in class org.apache.spark.sql.api.r.SQLUtils
getTaskInfos() - Method in class org.apache.spark.BarrierTaskContext: :: Experimental :: Returns BarrierTaskInfo for all tasks in this barrier stage, ordered by partition ID.
getTau0() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: A (positive) learning parameter that downweights early iterations.
getThreadDump() - Static method in class org.apache.spark.util.Utils: Return a thread dump of all threads' stacktraces.
getThreadDumpForThread(long) - Static method in class org.apache.spark.util.Utils
getThreshold() - Method in class org.apache.spark.ml.classification.LogisticRegression
getThreshold() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
getThreshold() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams: Get threshold for binary classification.
getThreshold() - Method in class org.apache.spark.ml.feature.Binarizer
getThreshold() - Method in interface org.apache.spark.ml.param.shared.HasThreshold
getThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel: Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
getThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel: Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
getThresholds() - Method in class org.apache.spark.ml.classification.LogisticRegression
getThresholds() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
getThresholds() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams: Get thresholds for binary or multiclass classification.
getThresholds() - Method in interface org.apache.spark.ml.param.shared.HasThresholds
getTimeAsMs(String) - Method in class org.apache.spark.SparkConf: Get a time parameter as milliseconds; throws a NoSuchElementException if it's not set.
getTimeAsMs(String, String) - Method in class org.apache.spark.SparkConf: Get a time parameter as milliseconds, falling back to a default if not set.
getTimeAsSeconds(String) - Method in class org.apache.spark.SparkConf: Get a time parameter as seconds; throws a NoSuchElementException if it's not set.
getTimeAsSeconds(String, String) - Method in class org.apache.spark.SparkConf: Get a time parameter as seconds, falling back to a default if not set.
getTimeMillis() - Method in interface org.apache.spark.util.Clock
getTimer(L) - Method in interface org.apache.spark.util.ListenerBus: Returns a CodaHale metrics Timer for measuring the listener's event processing time.
getTimestamp(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of timestamp type as java.sql.Timestamp.
getTimestamp() - Method in class org.apache.spark.streaming.kinesis.KinesisInitialPositions.AtTimestamp
getTimestampWritable(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getTimestampWritableConstantObjectInspector(Object) - Method in interface org.apache.spark.sql.hive.HiveInspectors
getTimeZoneOffset() - Static method in class org.apache.spark.ui.UIUtils
GETTING_RESULT_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.ToolTips
gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo: The time when the task started remotely getting the result.
gettingResultTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
gettingResultTime(TaskData) - Static method in class org.apache.spark.status.AppStatusUtils
gettingResultTime(long, long, long) - Static method in class org.apache.spark.status.AppStatusUtils
getTol() - Method in interface org.apache.spark.ml.param.shared.HasTol
getToLowercase() - Method in class org.apache.spark.ml.feature.RegexTokenizer
getTopicConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams
getTopicConcentration() - Method in class org.apache.spark.mllib.clustering.LDA: Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics' distributions over terms.
getTopicDistributionCol() - Method in interface org.apache.spark.ml.clustering.LDAParams
getTopologyForHost(String) - Method in class org.apache.spark.storage.DefaultTopologyMapper
getTopologyForHost(String) - Method in class org.apache.spark.storage.FileBasedTopologyMapper
getTopologyForHost(String) - Method in class org.apache.spark.storage.TopologyMapper: Gets the topology information given the host name
getTrainRatio() - Method in interface org.apache.spark.ml.tuning.TrainValidationSplitParams
getTreeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
getTruncateQuery(String, Option<Object>) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect: The SQL query used to truncate a table.
getTruncateQuery(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
getTruncateQuery(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
getTruncateQuery(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: The SQL query that should be used to truncate a table.
getTruncateQuery(String, Option<Object>) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: The SQL query that should be used to truncate a table.
getTruncateQuery(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
getTruncateQuery(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
getTruncateQuery(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.OracleDialect: The SQL query used to truncate a table.
getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect: The SQL query used to truncate a table.
getTruncateQuery(String, Option<Object>) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect: The SQL query used to truncate a table.
getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.NoopDialect
getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.OracleDialect
getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
getTruncateQuery$default$2() - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
getUDTFor(String) - Static method in class org.apache.spark.sql.types.UDTRegistration: Returns the Class of UserDefinedType for the name of a given user class.
getUidMap(Params) - Static method in class org.apache.spark.ml.util.MetaAlgorithmReadWrite: Examine the given estimator (which may be a compound estimator) and extract a mapping from UIDs to corresponding Params instances.
getUiRoot(ServletContext) - Static method in class org.apache.spark.status.api.v1.UIRootFromServletContext
getUpperBound(double, long, double) - Static method in class org.apache.spark.util.random.BinomialBounds: Returns a threshold p such that if we conduct n Bernoulli trials with success rate = p, it is very unlikely to have less than fraction * n successes.
getUpperBound(double) - Static method in class org.apache.spark.util.random.PoissonBounds: Returns a lambda such that Pr[X < s] is very small, where X ~ Pois(lambda).
getUpperBoundsOnCoefficients() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
getUpperBoundsOnIntercepts() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
getUsedTimeMs(long) - Static method in class org.apache.spark.util.Utils: Return a string telling how much time has passed, in milliseconds.
getUseNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getUserCol() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams
getUserJars(SparkConf) - Static method in class org.apache.spark.util.Utils: Return the jar files pointed to by the "spark.jars" property.
getUTF8String(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
getUTF8String(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
getUTF8String(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
getUTF8String(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the string type value for rowId.
getValidationIndicatorCol() - Method in interface org.apache.spark.ml.param.shared.HasValidationIndicatorCol
getValidationTol() - Method in interface org.apache.spark.ml.tree.GBTParams
getValidationTol() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
getValue(int) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Gets a value given its index.
getValuesMap(Seq<String>) - Method in interface org.apache.spark.sql.Row: Returns a Map consisting of names and values for the requested fieldNames. For primitive types, if the value is null it returns the 'zero value' specific for primitive ie.
getVarianceCol() - Method in interface org.apache.spark.ml.param.shared.HasVarianceCol
getVariancePower() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
getVectors() - Method in class org.apache.spark.ml.feature.Word2VecModel: Returns a DataFrame with two fields, "word" and "vector", with "word" being a String and "vector" the DenseVector that it is mapped to.
getVectors() - Method in class org.apache.spark.mllib.feature.Word2VecModel: Returns a map of words to their vector representations.
getVectorSize() - Method in interface org.apache.spark.ml.feature.Word2VecBase
getVocabSize() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams
getWeightCol() - Method in interface org.apache.spark.ml.param.shared.HasWeightCol
getWidth(Row) - Static method in class org.apache.spark.ml.image.ImageSchema: Gets the width of the image
getWindowSize() - Method in interface org.apache.spark.ml.feature.Word2VecBase
getWithMean() - Method in interface org.apache.spark.ml.feature.StandardScalerParams
getWithStd() - Method in interface org.apache.spark.ml.feature.StandardScalerParams
Gini - Class in org.apache.spark.mllib.tree.impurity: Class for calculating the Gini impurity (http://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity) during multiclass classification.
Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
GLMClassificationModel - Class in org.apache.spark.mllib.classification.impl: Helper class for import/export of GLM classification models.
GLMClassificationModel() - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel
GLMClassificationModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.classification.impl
GLMClassificationModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.classification.impl: Model data for import/export
GLMClassificationModel.SaveLoadV1_0$.Data$ - Class in org.apache.spark.mllib.classification.impl
GLMRegressionModel - Class in org.apache.spark.mllib.regression.impl: Helper methods for import/export of GLM regression models.
GLMRegressionModel() - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel
GLMRegressionModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.regression.impl
GLMRegressionModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.regression.impl: Model data for model import/export
GLMRegressionModel.SaveLoadV1_0$.Data$ - Class in org.apache.spark.mllib.regression.impl
glom() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in class org.apache.spark.rdd.RDD: Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
glom() - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
goButtonFormPath() - Method in interface org.apache.spark.ui.PagedTable: Returns the submission path for the "go to page #" form.
goodnessOfFit() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
grad(DenseMatrix<Object>, DenseMatrix<Object>, DenseVector<Object>) - Method in interface org.apache.spark.ml.ann.LayerModel: Computes the gradient.
grad() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
gradient() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator: The current weighted averaged gradient.
gradient() - Method in class org.apache.spark.ml.regression.AFTAggregator
Gradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to compute the gradient for a loss function, given a single data point.
Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError: Method to calculate the gradients for the gradient boosting calculation for least absolute error calculation.
gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss: Method to calculate the loss gradients for the gradient boosting calculation for binary classification. The gradient with respect to F(x) is: -4 y / (1 + exp(2 y F(x)))
gradient(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss: Method to calculate the gradients for the gradient boosting calculation.
gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError: Method to calculate the gradients for the gradient boosting calculation for least squares error calculation.
GradientBoostedTrees - Class in org.apache.spark.ml.tree.impl
GradientBoostedTrees() - Constructor for class org.apache.spark.ml.tree.impl.GradientBoostedTrees
GradientBoostedTrees - Class in org.apache.spark.mllib.tree: A class that implements Stochastic Gradient Boosting for regression and binary classification.
GradientBoostedTrees(BoostingStrategy) - Constructor for class org.apache.spark.mllib.tree.GradientBoostedTrees
GradientBoostedTreesModel - Class in org.apache.spark.mllib.tree.model: Represents a gradient boosted trees model.
GradientBoostedTreesModel(Enumeration.Value, DecisionTreeModel[], double[]) - Constructor for class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
GradientDescent - Class in org.apache.spark.mllib.optimization: Class used to solve an optimization problem using Gradient Descent.
gradientSumArray() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator: Array of gradient values that are mutated when new instances are added to the aggregator.
Graph<VD,ED> - Class in org.apache.spark.graphx: The Graph abstractly represents a graph with arbitrary objects associated with vertices and edges.
GraphGenerators - Class in org.apache.spark.graphx.util: A collection of graph generating functions.
GraphGenerators() - Constructor for class org.apache.spark.graphx.util.GraphGenerators
GraphImpl<VD,ED> - Class in org.apache.spark.graphx.impl: An implementation of Graph to support computation on graphs.
GraphLoader - Class in org.apache.spark.graphx: Provides utilities for loading Graphs from files.
GraphLoader() - Constructor for class org.apache.spark.graphx.GraphLoader
GraphOps<VD,ED> - Class in org.apache.spark.graphx: Contains additional functionality for Graph.
GraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.GraphOps
graphToGraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph: Implicitly extracts the GraphOps member from a graph.
GraphXUtils - Class in org.apache.spark.graphx
GraphXUtils() - Constructor for class org.apache.spark.graphx.GraphXUtils
greater(Duration) - Method in class org.apache.spark.streaming.Duration
greater(Time) - Method in class org.apache.spark.streaming.Time
greaterEq(Duration) - Method in class org.apache.spark.streaming.Duration
greaterEq(Time) - Method in class org.apache.spark.streaming.Time
GreaterThan - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a value greater than value.
GreaterThan(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThan
GreaterThanOrEqual - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a value greater than or equal to value.
GreaterThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThanOrEqual
greatest(Column...) - Static method in class org.apache.spark.sql.functions: Returns the greatest value of the list of values, skipping null values.
greatest(String, String...) - Static method in class org.apache.spark.sql.functions: Returns the greatest value of the list of column names, skipping null values.
greatest(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Returns the greatest value of the list of values, skipping null values.
greatest(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Returns the greatest value of the list of column names, skipping null values.
gridGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators: Create a rows by cols grid graph with each vertex connected to its row+1 and col+1 neighbors.
groupArr() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
groupBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD of grouped elements.
groupBy(Function<T, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD of grouped elements.
groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped items.
groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped elements.
groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped items.
groupBy(Column...) - Method in class org.apache.spark.sql.Dataset: Groups the Dataset using the specified columns, so we can run aggregation on them.
groupBy(String, String...) - Method in class org.apache.spark.sql.Dataset: Groups the Dataset using the specified columns, so that we can run aggregation on them.
groupBy(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Groups the Dataset using the specified columns, so we can run aggregation on them.
groupBy(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset: Groups the Dataset using the specified columns, so that we can run aggregation on them.
groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey(Function1<T, K>, Encoder<K>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Scala-specific) Returns a KeyValueGroupedDataset where the data is grouped by the given key func.
groupByKey(MapFunction<T, K>, Encoder<K>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Java-specific) Returns a KeyValueGroupedDataset where the data is grouped by the given key func.
groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey on each RDD of this DStream.
groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey on each RDD.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Create a new DStream by applying groupByKey over a sliding window on this DStream.
GroupByType$() - Constructor for class org.apache.spark.sql.RelationalGroupedDataset.GroupByType$
groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.Graph: Merges multiple edges between two vertices into a single edge.
groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.GraphImpl
groupHash() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
grouping(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.
grouping(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.
grouping_id(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the level of grouping.
grouping_id(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the level of grouping.
GroupMappingServiceProvider - Interface in org.apache.spark.security: This Spark trait is used for mapping a given userName to a set of groups which it belongs to.
GroupState<S> - Interface in org.apache.spark.sql.streaming: :: Experimental ::
GroupStateTimeout - Class in org.apache.spark.sql.streaming: Represents the type of timeouts possible for the Dataset operations `mapGroupsWithState` and `flatMapGroupsWithState`.
GroupStateTimeout() - Constructor for class org.apache.spark.sql.streaming.GroupStateTimeout
groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
gt(double) - Static method in class org.apache.spark.ml.param.ParamValidators: Check if value is greater than lowerBound
gt(Object) - Method in class org.apache.spark.sql.Column: Greater than.
gtEq(double) - Static method in class org.apache.spark.ml.param.ParamValidators: Check if value is greater than or equal to lowerBound
guard(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
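
The LogLoss gradient entry in this section quotes the formula -4 y / (1 + exp(2 y F(x))). The following is a minimal plain-Java sketch of that formula only; it does not call Spark and is not Spark's implementation. The class name LogLossGradientSketch is hypothetical, and the sketch assumes that the first argument is the prediction F(x) and the second is the label y, encoded as -1 or +1.

```java
public class LogLossGradientSketch {

    // Gradient of the loss with respect to F(x): -4 y / (1 + exp(2 y F(x))).
    // Assumption: prediction is F(x); label is y, encoded as -1 or +1.
    public static double gradient(double prediction, double label) {
        return -4.0 * label / (1.0 + Math.exp(2.0 * label * prediction));
    }

    public static void main(String[] args) {
        // Print the gradient for a few predictions under both label values.
        double[] predictions = {-2.0, 0.0, 2.0};
        for (double f : predictions) {
            System.out.printf("F(x)=%5.1f  y=+1 -> %8.4f  y=-1 -> %8.4f%n",
                    f, gradient(f, 1.0), gradient(f, -1.0));
        }
    }
}
```

Under these assumptions the gradient has magnitude 2 at F(x) = 0 for either label, and its magnitude shrinks toward 0 as the prediction agrees more strongly with the label.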

H

hadoopConfiguration() - Method in class org.apache.spark.api.java.JavaSparkContext: Returns the Hadoop configuration used for the Hadoop code.
hadoopConfiguration() - Method in class org.apache.spark.SparkContext: A default Hadoop Configuration for the Hadoop code.
hadoopDelegationCreds() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig
hadoopFile(String, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat
hadoopFile(String, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat
hadoopFile(String, int, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
hadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
HadoopMapPartitionsWithSplitRDD$() - Constructor for class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD$
HadoopMapRedCommitProtocol - Class in org.apache.spark.internal.io: A FileCommitProtocol implementation backed by an underlying Hadoop OutputCommitter (from the old mapred API).
HadoopMapRedCommitProtocol(String, String) - Constructor for class org.apache.spark.internal.io.HadoopMapRedCommitProtocol
HadoopMapReduceCommitProtocol - Class in org.apache.spark.internal.io: A FileCommitProtocol implementation backed by an underlying Hadoop OutputCommitter (from the newer mapreduce API, not the old mapred API).
HadoopMapReduceCommitProtocol(String, String, boolean) - Constructor for class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf giving its InputFormat and any other necessary info.
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf giving its InputFormat and any other necessary info.
HadoopRDD<K,V> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the older MapReduce API (org.apache.hadoop.mapred).
HadoopRDD(SparkContext, Broadcast<org.apache.spark.util.SerializableConfiguration>, Option<Function1<JobConf, BoxedUnit>>, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
HadoopRDD(SparkContext, JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
hadoopRDD(JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and other necessary info.
HadoopRDD.HadoopMapPartitionsWithSplitRDD$ - Class in org.apache.spark.rdd
HadoopWriteConfigUtil<K,V> - Class in org.apache.spark.internal.io: Interface for creating the output format/committer/writer used when saving an RDD using a Hadoop OutputFormat (from both the old mapred API and the new mapreduce API).
HadoopWriteConfigUtil(ClassTag<V>) - Constructor for class org.apache.spark.internal.io.HadoopWriteConfigUtil
hammingLoss() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns Hamming-loss
handleInvalid() - Method in class org.apache.spark.ml.feature.Bucketizer: Param for how to handle invalid entries.
handleInvalid() - Method in interface org.apache.spark.ml.feature.OneHotEncoderBase: Param for how to handle invalid data during transform().
handleInvalid() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase: Param for how to handle invalid entries.
handleInvalid() - Method in interface org.apache.spark.ml.feature.RFormulaBase: Param for how to handle invalid data (unseen or NULL values) in features and label column of string type.
handleInvalid() - Method in interface org.apache.spark.ml.feature.StringIndexerBase: Param for how to handle invalid data (unseen labels or NULL values).
handleInvalid() - Method in class org.apache.spark.ml.feature.VectorAssembler: Param for how to handle invalid data (NULL values).
handleInvalid() - Method in interface org.apache.spark.ml.feature.VectorIndexerParams: Param for how to handle invalid data (unseen labels or NULL values).
handleInvalid() - Method in class org.apache.spark.ml.feature.VectorSizeHint: Param for how to handle invalid entries.
handleInvalid() - Method in interface org.apache.spark.ml.param.shared.HasHandleInvalid: Param for how to handle invalid entries.
hasAccumulators(StageData) - Static method in class org.apache.spark.ui.jobs.ApiHelper
HasAggregationDepth - Interface in org.apache.spark.ml.param.shared: Trait for shared param aggregationDepth (default: 2).
hasAttr(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Test whether this attribute group contains a specific attribute.
hasBytesSpilled(StageData) - Static method in class org.apache.spark.ui.jobs.ApiHelper
HasCachedBlocks(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.HasCachedBlocks
HasCachedBlocks$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.HasCachedBlocks$
hasCachedSerializedBroadcast() - Method in class org.apache.spark.ShuffleStatus
HasCheckpointInterval - Interface in org.apache.spark.ml.param.shared: Trait for shared param checkpointInterval.
HasCollectSubModels - Interface in org.apache.spark.ml.param.shared: Trait for shared param collectSubModels (default: false).
hasDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params: Tests whether the input param has a default value set.
HasDistanceMeasure - Interface in org.apache.spark.ml.param.shared: Trait for shared param distanceMeasure (default: org.apache.spark.mllib.clustering.DistanceMeasure.EUCLIDEAN).
HasElasticNetParam - Interface in org.apache.spark.ml.param.shared: Trait for shared param elasticNetParam.
HasFeaturesCol - Interface in org.apache.spark.ml.param.shared: Trait for shared param featuresCol (default: "features").
HasFitIntercept - Interface in org.apache.spark.ml.param.shared: Trait for shared param fitIntercept (default: true).
hash(Column...) - Static method in class org.apache.spark.sql.functions: Calculates the hash code of given columns, and returns the result as an int column.
hash(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Calculates the hash code of given columns, and returns the result as an int column.
HasHandleInvalid - Interface in org.apache.spark.ml.param.shared: Trait for shared param handleInvalid.
hashCode() - Method in class org.apache.spark.api.java.Optional
hashCode() - Method in class org.apache.spark.graphx.EdgeDirection
hashCode() - Method in class org.apache.spark.HashPartitioner
hashCode() - Method in class org.apache.spark.ml.attribute.AttributeGroup
hashCode() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
hashCode() - Method in class org.apache.spark.ml.attribute.NominalAttribute
hashCode() - Method in class org.apache.spark.ml.attribute.NumericAttribute
hashCode() - Method in class org.apache.spark.ml.linalg.DenseMatrix
hashCode() - Method in class org.apache.spark.ml.linalg.DenseVector
hashCode() - Method in class org.apache.spark.ml.linalg.SparseMatrix
hashCode() - Method in class org.apache.spark.ml.linalg.SparseVector
hashCode() - Method in interface org.apache.spark.ml.linalg.Vector: Returns a hash code value for the vector.
hashCode() - Method in class org.apache.spark.ml.param.Param
hashCode() - Method in class org.apache.spark.ml.tree.CategoricalSplit
hashCode() - Method in class org.apache.spark.ml.tree.ContinuousSplit
hashCode() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
hashCode() - Method in class org.apache.spark.mllib.linalg.DenseVector
hashCode() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
hashCode() - Method in class org.apache.spark.mllib.linalg.SparseVector
hashCode() - Method in interface org.apache.spark.mllib.linalg.Vector: Returns a hash code value for the vector.
hashCode() - Method in class org.apache.spark.mllib.linalg.VectorUDT
hashCode() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
hashCode() - Method in class org.apache.spark.mllib.tree.model.Predict
hashCode() - Method in class org.apache.spark.partial.BoundedDouble
hashCode() - Method in interface org.apache.spark.Partition
hashCode() - Method in class org.apache.spark.RangePartitioner
hashCode() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
hashCode() - Method in class org.apache.spark.scheduler.InputFormatInfo
hashCode() - Method in class org.apache.spark.scheduler.SplitInfo
hashCode() - Method in class org.apache.spark.sql.Column
hashCode() - Method in interface org.apache.spark.sql.Row
hashCode() - Method in class org.apache.spark.sql.sources.In
hashCode() - Method in class org.apache.spark.sql.sources.v2.reader.streaming.Offset
hashCode() - Method in class org.apache.spark.sql.types.Decimal
hashCode() - Method in class org.apache.spark.sql.types.Metadata
hashCode() - Method in class org.apache.spark.sql.types.StructType
hashCode() - Method in class org.apache.spark.storage.BlockManagerId
hashCode() - Method in class org.apache.spark.storage.StorageLevel
HashingTF - Class in org.apache.spark.ml.feature: Maps a sequence of terms to their term frequencies using the hashing trick.
HashingTF(String) - Constructor for class org.apache.spark.ml.feature.HashingTF
HashingTF() - Constructor for class org.apache.spark.ml.feature.HashingTF
HashingTF - Class in org.apache.spark.mllib.feature: Maps a sequence of terms to their term frequencies using the hashing trick.
HashingTF(int) - Constructor for class org.apache.spark.mllib.feature.HashingTF
HashingTF() - Constructor for class org.apache.spark.mllib.feature.HashingTF
HashPartitioner - Class in org.apache.spark: A Partitioner that implements hash-based partitioning using Java's Object.hashCode.
HashPartitioner(int) - Constructor for class org.apache.spark.HashPartitioner
hasInput(StageData) - Static method in class org.apache.spark.ui.jobs.ApiHelper
HasInputCol - Interface in org.apache.spark.ml.param.shared: Trait for shared param inputCol.
HasInputCols - Interface in org.apache.spark.ml.param.shared: Trait for shared param inputCols.
hasInputOutputFormat() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
hasLabelCol(StructType) - Method in interface org.apache.spark.ml.feature.RFormulaBase
HasLabelCol - Interface in org.apache.spark.ml.param.shared: Trait for shared param labelCol (default: "label").
hasLinkPredictionCol() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase: Checks whether we should output link prediction.
HasLoss - Interface in org.apache.spark.ml.param.shared: Trait for shared param loss.
HasMaxIter - Interface in org.apache.spark.ml.param.shared: Trait for shared param maxIter.
hasMemoryInfo() - Method in class org.apache.spark.status.LiveExecutor
hasNext() - Method in class org.apache.spark.InterruptibleIterator
hasNull() - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
hasNull() - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns true if this column vector contains any null values.
hasOffsetCol() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase: Checks whether offset column is set and nonempty.
hasOutput(StageData) - Static method in class org.apache.spark.ui.jobs.ApiHelper
HasOutputCol - Interface in org.apache.spark.ml.param.shared: Trait for shared param outputCol (default: uid + "__output").
HasOutputCols - Interface in org.apache.spark.ml.param.shared: Trait for shared param outputCols.
HasParallelism - Interface in org.apache.spark.ml.param.shared: Trait to define a level of parallelism for algorithms that are able to use multithreaded execution, and provide a thread-pool based execution context.
hasParam(String) - Method in interface org.apache.spark.ml.param.Params: Tests whether this instance contains a param with a given name.
hasParent() - Method in class org.apache.spark.ml.Model: Indicates whether this Model has a corresponding parent.
HasPredictionCol - Interface in org.apache.spark.ml.param.shared: Trait for shared param predictionCol (default: "prediction").
HasProbabilityCol - Interface in org.apache.spark.ml.param.shared: Trait for shared param probabilityCol (default: "probability").
hasQuantilesCol() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams: Checks whether the input has quantiles column name.
HasRawPredictionCol - Interface in org.apache.spark.ml.param.shared: Trait for shared param rawPredictionCol (default: "rawPrediction").
HasRegParam - Interface in org.apache.spark.ml.param.shared: Trait for shared param regParam.
hasRootAsShutdownDeleteDir(File) - Static method in class org.apache.spark.util.ShutdownHookManager
HasSeed - Interface in org.apache.spark.ml.param.shared: Trait for shared param seed (default: this.getClass.getName.hashCode.toLong).
hasShuffleRead(StageData) - Static method in class org.apache.spark.ui.jobs.ApiHelper
hasShuffleWrite(StageData) - Static method in class org.apache.spark.ui.jobs.ApiHelper
hasShutdownDeleteDir(File) - Static method in class org.apache.spark.util.ShutdownHookManager
HasSolver - Interface in org.apache.spark.ml.param.shared: Trait for shared param solver.
HasStandardization - Interface in org.apache.spark.ml.param.shared: Trait for shared param standardization (default: true).
HasStepSize - Interface in org.apache.spark.ml.param.shared: Trait for shared param stepSize.
hasSubModels() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
hasSubModels() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
hasSummary() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: Indicates whether a training summary exists for this model instance.
hasSummary() - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel: Return true if a summary of the model exists.
hasSummary() - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel: Return true if a summary of the model exists.
hasSummary() - Method in class org.apache.spark.ml.clustering.KMeansModel: Return true if a summary of the model exists.
hasSummary() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel: Indicates if summary is available.
hasSummary() - Method in class org.apache.spark.ml.regression.LinearRegressionModel: Indicates whether a training summary exists for this model instance.
HasThreshold - Interface in org.apache.spark.ml.param.shared: Trait for shared param threshold.
HasThresholds - Interface in org.apache.spark.ml.param.shared: Trait for shared param thresholds.
hasTimedOut() - Method in interface org.apache.spark.sql.streaming.GroupState: Whether the function has been called because the key has timed out.
HasTol - Interface in org.apache.spark.ml.param.shared: Trait for shared param tol.
HasValidationIndicatorCol - Interface in org.apache.spark.ml.param.shared: Trait for shared param validationIndicatorCol.
hasValue(String) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Tests whether this attribute contains a specific value.
HasVarianceCol - Interface in org.apache.spark.ml.param.shared: Trait for shared param varianceCol.
HasWeightCol - Interface in org.apache.spark.ml.param.shared: Trait for shared param weightCol.
hasWeightCol() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase: Checks whether the weight column is set and nonempty.
hasWeightCol() - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase: Checks whether the input has a weight column.
hasWriteObjectMethod() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
hasWriteReplaceMethod() - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
HdfsUtils - Class in org.apache.spark.streaming.util
HdfsUtils() - Constructor for class org.apache.spark.streaming.util.HdfsUtils
head(int) - Method in class org.apache.spark.sql.Dataset: Returns the first n rows.
head() - Method in class org.apache.spark.sql.Dataset: Returns the first row.
HEADER_ACCUMULATORS() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_ATTEMPT() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_DESER_TIME() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_DISK_SPILL() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_DURATION() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_ERROR() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_EXECUTOR() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_GC_TIME() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_GETTING_RESULT_TIME() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_HOST() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_ID() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_INPUT_SIZE() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_LAUNCH_TIME() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_LOCALITY() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_MEM_SPILL() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_OUTPUT_SIZE() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_PEAK_MEM() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_SER_TIME() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_SHUFFLE_READ_TIME() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_SHUFFLE_REMOTE_READS() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_SHUFFLE_TOTAL_READS() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_SHUFFLE_WRITE_SIZE() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_SHUFFLE_WRITE_TIME() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_STATUS() - Static method in class org.apache.spark.ui.jobs.ApiHelper
HEADER_TASK_INDEX() - Static method in class org.apache.spark.ui.jobs.ApiHelper
headers() - Method in interface org.apache.spark.ui.PagedTable
headerSparkPage(HttpServletRequest, String, Function0<Seq<Node>>, SparkUITab, Option<Object>, Option<String>, boolean, boolean) - Static method in class org.apache.spark.ui.UIUtils: Returns a Spark page with correctly formatted headers.
hex(Column) - Static method in class org.apache.spark.sql.functions: Computes hex value of the given column.
high() - Method in class org.apache.spark.partial.BoundedDouble
HingeGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a Hinge loss function, as used in SVM binary classification.
HingeGradient() - Constructor for class org.apache.spark.mllib.optimization.HingeGradient
hint(String, Object...) - Method in class org.apache.spark.sql.Dataset: Specifies some hint on the current Dataset.
hint(String, Seq<Object>) - Method in class org.apache.spark.sql.Dataset: Specifies some hint on the current Dataset.
histogram(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[]) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute a histogram using the provided buckets.
histogram(Double[], boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
histogram(int) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[], boolean) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute a histogram using the provided buckets.
HIVE_GENERIC_UDF_MACRO_CLS() - Static method in class org.apache.spark.sql.hive.HiveShim
HIVE_METASTORE_BARRIER_PREFIXES() - Static method in class org.apache.spark.sql.hive.HiveUtils
HIVE_METASTORE_JARS() - Static method in class org.apache.spark.sql.hive.HiveUtils
HIVE_METASTORE_SHARED_PREFIXES() - Static method in class org.apache.spark.sql.hive.HiveUtils
HIVE_METASTORE_VERSION() - Static method in class org.apache.spark.sql.hive.HiveUtils
HIVE_THRIFT_SERVER_ASYNC() - Static method in class org.apache.spark.sql.hive.HiveUtils
HiveAnalysis - Class in org.apache.spark.sql.hive: Replaces generic operations with specific variants that are designed to work with Hive.
HiveAnalysis() - Constructor for class org.apache.spark.sql.hive.HiveAnalysis
HiveCatalogMetrics - Class in org.apache.spark.metrics.source: :: Experimental :: Metrics for access to the hive external catalog.
HiveCatalogMetrics() - Constructor for class org.apache.spark.metrics.source.HiveCatalogMetrics
HiveClient - Interface in org.apache.spark.sql.hive.client: An externally visible interface to the Hive client.
HiveContext - Class in org.apache.spark.sql.hive: Deprecated. Use SparkSession.builder.enableHiveSupport instead. Since 2.0.0.
HiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.HiveContext: Deprecated.
HiveContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.hive.HiveContext: Deprecated.
HiveFileFormat - Class in org.apache.spark.sql.hive.execution: FileFormat for writing Hive tables.
HiveFileFormat(org.apache.spark.sql.hive.HiveShim.ShimFileSinkDesc) - Constructor for class org.apache.spark.sql.hive.execution.HiveFileFormat
HiveFileFormat() - Constructor for class org.apache.spark.sql.hive.execution.HiveFileFormat
HiveFunctionWrapper$() - Constructor for class org.apache.spark.sql.hive.HiveShim.HiveFunctionWrapper$
HiveInspectors - Interface in org.apache.spark.sql.hive
HiveInspectors.typeInfoConversions - Class in org.apache.spark.sql.hive
HiveOptions - Class in org.apache.spark.sql.hive.execution: Options for the Hive data source.
HiveOptions(CaseInsensitiveMap<String>) - Constructor for class org.apache.spark.sql.hive.execution.HiveOptions
HiveOptions(Map<String, String>) - Constructor for class org.apache.spark.sql.hive.execution.HiveOptions
HiveOutputWriter - Class in org.apache.spark.sql.hive.execution
HiveOutputWriter(String, org.apache.spark.sql.hive.HiveShim.ShimFileSinkDesc, JobConf, StructType) - Constructor for class org.apache.spark.sql.hive.execution.HiveOutputWriter
HiveScriptIOSchema - Class in org.apache.spark.sql.hive.execution
HiveScriptIOSchema(Seq<Tuple2<String, String>>, Seq<Tuple2<String, String>>, Option<String>, Option<String>, Seq<Tuple2<String, String>>, Seq<Tuple2<String, String>>, Option<String>, Option<String>, boolean) - Constructor for class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
HiveSessionResourceLoader - Class in org.apache.spark.sql.hive
HiveSessionResourceLoader(SparkSession, Function0<HiveClient>) - Constructor for class org.apache.spark.sql.hive.HiveSessionResourceLoader
HiveSessionStateBuilder - Class in org.apache.spark.sql.hive: Builder that produces a Hive-aware SessionState.
HiveSessionStateBuilder(SparkSession, Option<SessionState>) - Constructor for class org.apache.spark.sql.hive.HiveSessionStateBuilder
HiveShim - Class in org.apache.spark.sql.hive
HiveShim() - Constructor for class org.apache.spark.sql.hive.HiveShim
HiveShim.HiveFunctionWrapper$ - Class in org.apache.spark.sql.hive
HiveStrategies - Interface in org.apache.spark.sql.hive
HiveStrategies.HiveTableScans - Class in org.apache.spark.sql.hive: Retrieves data using a HiveTableScan.
HiveStrategies.HiveTableScans$ - Class in org.apache.spark.sql.hive: Retrieves data using a HiveTableScan.
HiveStrategies.Scripts - Class in org.apache.spark.sql.hive
HiveStrategies.Scripts$ - Class in org.apache.spark.sql.hive
HiveStringType - Class in org.apache.spark.sql.types: A hive string type for compatibility.
HiveStringType() - Constructor for class org.apache.spark.sql.types.HiveStringType
HiveTableScans() - Method in interface org.apache.spark.sql.hive.HiveStrategies
HiveTableScans() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans
HiveTableScans$() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans$
HiveTableUtil - Class in org.apache.spark.sql.hive
HiveTableUtil() - Constructor for class org.apache.spark.sql.hive.HiveTableUtil
HiveUDAFBuffer - Class in org.apache.spark.sql.hive
HiveUDAFBuffer(GenericUDAFEvaluator.AggregationBuffer, boolean) - Constructor for class org.apache.spark.sql.hive.HiveUDAFBuffer
HiveUtils - Class in org.apache.spark.sql.hive
HiveUtils() - Constructor for class org.apache.spark.sql.hive.HiveUtils
holdingLocks() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
horzcat(Matrix[]) - Static method in class org.apache.spark.ml.linalg.Matrices: Horizontally concatenate a sequence of matrices.
horzcat(Matrix[]) - Static method in class org.apache.spark.mllib.linalg.Matrices: Horizontally concatenate a sequence of matrices.
host() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutorsOnHost
host() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveWorker
host() - Method in class org.apache.spark.scheduler.TaskInfo
host() - Method in interface org.apache.spark.scheduler.TaskLocation
host() - Method in interface org.apache.spark.SparkExecutorInfo
host() - Method in class org.apache.spark.SparkExecutorInfoImpl
host() - Method in class org.apache.spark.status.api.v1.TaskData
host() - Method in class org.apache.spark.status.LiveExecutor
HOST() - Static method in class org.apache.spark.status.TaskIndexNames
host() - Method in class org.apache.spark.storage.BlockManagerId
hostId() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
hostId() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
hostId() - Method in class org.apache.spark.scheduler.SparkListenerNodeUnblacklisted
hostLocation() - Method in class org.apache.spark.scheduler.SplitInfo
hostname() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
hostname() - Method in class org.apache.spark.status.LiveExecutor
hostPort() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
hostPort() - Method in class org.apache.spark.status.LiveExecutor
hostPort() - Method in class org.apache.spark.storage.BlockManagerId
hostToLocalTaskCount() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
hour(Column) - Static method in class org.apache.spark.sql.functions: Extracts the hours as an integer from a given date/timestamp/string.
hours() - Static method in class org.apache.spark.scheduler.StatsReportListener
html() - Method in class org.apache.spark.status.api.v1.StackTrace
htmlResponderToServlet(Function1<HttpServletRequest, Seq<Node>>) - Static method in class org.apache.spark.ui.JettyUtils
httpRequest() - Method in interface org.apache.spark.status.api.v1.ApiRequestContext
httpResponseCode(URL, String, Seq<Tuple2<String, String>>) - Static method in class org.apache.spark.TestUtils: Returns the response code from an HTTP(S) URL.
hypot(Column, Column) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(Column, String) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(String, Column) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(String, String) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(Column, double) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(String, double) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(double, Column) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(double, String) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
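
The hypot entries above describe computing sqrt(a^2 + b^2) "without intermediate overflow or underflow". The following is a minimal sketch of what that property means, using plain java.lang.Math.hypot rather than the Spark column function. The class name HypotOverflowDemo and the helper naiveHypot are hypothetical names introduced only for this illustration, and the sketch makes no claim about how Spark implements hypot internally.

```java
public class HypotOverflowDemo {
    // Naive formula: squares each input before adding, so the intermediate
    // values can overflow to Infinity or underflow to zero.
    static double naiveHypot(double a, double b) {
        return Math.sqrt(a * a + b * b);
    }

    public static void main(String[] args) {
        // Large inputs: a * a exceeds the double range in the naive formula.
        double bigA = 3e200;
        double bigB = 4e200;
        System.out.println("naive (large)      = " + naiveHypot(bigA, bigB));
        System.out.println("Math.hypot (large) = " + Math.hypot(bigA, bigB));

        // Tiny inputs: a * a is too small to represent in the naive formula.
        double smallA = 3e-200;
        double smallB = 4e-200;
        System.out.println("naive (small)      = " + naiveHypot(smallA, smallB));
        System.out.println("Math.hypot (small) = " + Math.hypot(smallA, smallB));
    }
}
```

In both cases the true result is representable as a double even though the squared intermediates are not, which is the situation the hypot description refers to.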

I

i() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
id() - Method in class org.apache.spark.Accumulable: Deprecated.
id() - Method in interface org.apache.spark.api.java.JavaRDDLike: A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.broadcast.Broadcast
id() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
id() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
id() - Method in class org.apache.spark.mllib.tree.model.Node
id() - Method in class org.apache.spark.rdd.RDD: A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.scheduler.AccumulableInfo
id() - Method in class org.apache.spark.scheduler.TaskInfo
id() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Returns the unique id of this query that persists across restarts from checkpoint data.
id() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryStartedEvent
id() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryTerminatedEvent
id() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
id() - Method in class org.apache.spark.status.api.v1.AccumulableInfo
id() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
id() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
id() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
id() - Method in class org.apache.spark.storage.RDDInfo
id() - Method in class org.apache.spark.streaming.dstream.InputDStream: This is a unique identifier for the input stream.
id() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
id() - Method in class org.apache.spark.util.AccumulatorV2: Returns the id of this accumulator; can only be called after registration.
Identifiable - Interface in org.apache.spark.ml.util: :: DeveloperApi ::
Identity$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Identity$
IDF - Class in org.apache.spark.ml.feature: Compute the Inverse Document Frequency (IDF) given a collection of documents.
IDF(String) - Constructor for class org.apache.spark.ml.feature.IDF
IDF() - Constructor for class org.apache.spark.ml.feature.IDF
idf() - Method in class org.apache.spark.ml.feature.IDFModel: Returns the IDF vector.
IDF - Class in org.apache.spark.mllib.feature: Inverse document frequency (IDF).
IDF(int) - Constructor for class org.apache.spark.mllib.feature.IDF
IDF() - Constructor for class org.apache.spark.mllib.feature.IDF
idf() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator: Returns the current IDF vector.
idf() - Method in class org.apache.spark.mllib.feature.IDFModel
IDF.DocumentFrequencyAggregator - Class in org.apache.spark.mllib.feature: Document frequency aggregator.
IDFBase - Interface in org.apache.spark.ml.feature: Params for IDF and IDFModel.
IDFModel - Class in org.apache.spark.ml.feature: Model fitted by IDF.
IDFModel - Class in org.apache.spark.mllib.feature: Represents an IDF model that can transform term frequency vectors.
ifPartitionNotExists() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
ImageDataSource - Class in org.apache.spark.ml.source.image: The image package implements the Spark SQL data source API for loading image data as a DataFrame.
ImageDataSource() - Constructor for class org.apache.spark.ml.source.image.ImageDataSource
imageFields() - Static method in class org.apache.spark.ml.image.ImageSchema
ImageSchema - Class in org.apache.spark.ml.image: :: Experimental :: Defines the image schema and methods to read and manipulate images.
ImageSchema() - Constructor for class org.apache.spark.ml.image.ImageSchema
imageSchema() - Static method in class org.apache.spark.ml.image.ImageSchema: DataFrame with a single column of images named "image" (nullable)
implicitPrefs() - Method in interface org.apache.spark.ml.recommendation.ALSParams: Param to decide whether to use implicit preference.
implicits() - Method in class org.apache.spark.sql.SparkSession: Accessor for nested Scala object
implicits() - Method in class org.apache.spark.sql.SQLContext: Accessor for nested Scala object
implicits$() - Constructor for class org.apache.spark.sql.SparkSession.implicits$
implicits$() - Constructor for class org.apache.spark.sql.SQLContext.implicits$
improveException(Object, NotSerializableException) - Static method in class org.apache.spark.serializer.SerializationDebugger: Improve the given NotSerializableException with the serialization path leading from the given object to the problematic object.
Impurities - Class in org.apache.spark.mllib.tree.impurity: Factory for Impurity instances.
Impurities() - Constructor for class org.apache.spark.mllib.tree.impurity.Impurities
impurity() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
impurity() - Method in class org.apache.spark.ml.tree.InternalNode
impurity() - Method in class org.apache.spark.ml.tree.LeafNode
impurity() - Method in class org.apache.spark.ml.tree.Node: Impurity measure at this node (for training data)
impurity() - Method in interface org.apache.spark.ml.tree.TreeClassifierParams: Criterion used for information gain calculation (case-insensitive).
impurity() - Method in interface org.apache.spark.ml.tree.TreeRegressorParams: Criterion used for information gain calculation (case-insensitive).
impurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Impurity - Interface in org.apache.spark.mllib.tree.impurity: Trait for calculating information gain.
impurity() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
impurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
impurity() - Method in class org.apache.spark.mllib.tree.model.Node
impurityStats() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
Imputer - Class in org.apache.spark.ml.feature: :: Experimental :: Imputation estimator for completing missing values, either using the mean or the median of the columns in which the missing values are located.
Imputer(String) - Constructor for class org.apache.spark.ml.feature.Imputer
Imputer() - Constructor for class org.apache.spark.ml.feature.Imputer
ImputerModel - Class in org.apache.spark.ml.feature: :: Experimental :: Model fitted by Imputer.
ImputerParams - Interface in org.apache.spark.ml.feature: Params for Imputer and ImputerModel.
In() - Static method in class org.apache.spark.graphx.EdgeDirection: Edges arriving at a vertex.
In - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to one of the values in the array.
In(String, Object[]) - Constructor for class org.apache.spark.sql.sources.In
INACTIVE() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
inArray(Object) - Static method in class org.apache.spark.ml.param.ParamValidators: Check for value in an allowed set of values.
inArray(List<T>) - Static method in class org.apache.spark.ml.param.ParamValidators: Check for value in an allowed set of values.
InBlock$() - Constructor for class org.apache.spark.ml.recommendation.ALS.InBlock$
InboxMessage - Interface in org.apache.spark.rpc.netty
IncompatibleMergeException - Exception in org.apache.spark.util.sketch
IncompatibleMergeException(String) - Constructor for exception org.apache.spark.util.sketch.IncompatibleMergeException
incrementFetchedPartitions(int) - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
incrementFileCacheHits(int) - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
incrementFilesDiscovered(int) - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
incrementHiveClientCalls(int) - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
incrementParallelListingJobCount(int) - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
inDegrees() - Method in class org.apache.spark.graphx.GraphOps: The in-degree of each vertex in the graph.
independence() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
INDETERMINATE() - Static method in class org.apache.spark.rdd.DeterministicLevel
index() - Method in class org.apache.spark.ml.attribute.Attribute: Index of the attribute.
INDEX() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
index() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
index() - Method in class org.apache.spark.ml.attribute.NominalAttribute
index() - Method in class org.apache.spark.ml.attribute.NumericAttribute
index() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
index(int, int) - Method in interface org.apache.spark.ml.linalg.Matrix: Return the index for the (i, j)-th element in the backing array.
index() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
index(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix: Return the index for the (i, j)-th element in the backing array.
index() - Method in interface org.apache.spark.Partition: Get the partition's index within its parent RDD
index() - Method in class org.apache.spark.scheduler.TaskInfo: The index of this task within its task set.
index() - Method in class org.apache.spark.status.api.v1.TaskData
IndexedRow - Class in org.apache.spark.mllib.linalg.distributed: Represents a row of IndexedRowMatrix.
IndexedRow(long, Vector) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRow
IndexedRowMatrix - Class in org.apache.spark.mllib.linalg.distributed: Represents a row-oriented DistributedMatrix with indexed rows.
IndexedRowMatrix(RDD<IndexedRow>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
IndexedRowMatrix(RDD<IndexedRow>) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Alternative constructor leaving matrix dimensions to be determined automatically.
indexName(String) - Static method in class org.apache.spark.ui.jobs.ApiHelper
indexOf(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Index of an attribute specified by name.
indexOf(String) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Index of a specific value.
indexOf(Object) - Method in class org.apache.spark.mllib.feature.HashingTF: Returns the index of the input term.
indexToLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Returns the level of the tree that the given node is in.
IndexToString - Class in org.apache.spark.ml.feature: A Transformer that maps a column of indices back to a new column of corresponding string values.
IndexToString(String) - Constructor for class org.apache.spark.ml.feature.IndexToString
IndexToString() - Constructor for class org.apache.spark.ml.feature.IndexToString
indices() - Method in class org.apache.spark.ml.feature.VectorSlicer: An array of indices to select features from a vector column.
indices() - Method in class org.apache.spark.ml.linalg.SparseVector
indices() - Method in class org.apache.spark.mllib.linalg.SparseVector
inferSchema(SparkSession, Map<String, String>, Seq<FileStatus>) - Method in class org.apache.spark.sql.hive.execution.HiveFileFormat
inferSchema(CatalogTable) - Static method in class org.apache.spark.sql.hive.HiveUtils: Infers the schema for Hive serde tables and returns the CatalogTable with the inferred schema.
inferSchema(SparkSession, Map<String, String>, Seq<FileStatus>) - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
info() - Method in class org.apache.spark.status.LiveRDD
info() - Method in class org.apache.spark.status.LiveStage
info() - Method in class org.apache.spark.status.LiveTask
infoChanged(SparkAppHandle) - Method in interface org.apache.spark.launcher.SparkAppHandle.Listener: Callback for changes in any information that is not the handle's state.
infoGain() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
InformationGainStats - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Information gain statistics for each split. Params: gain (information gain value), impurity (current node impurity), leftImpurity (left node impurity), rightImpurity (right node impurity), leftPredict (left node predict), rightPredict (right node predict).
InformationGainStats(double, double, double, double, Predict, Predict) - Constructor for class org.apache.spark.mllib.tree.model.InformationGainStats
init() - Method in interface org.apache.spark.ExecutorPlugin: Initialize the executor plugin.
initcap(Column) - Static method in class org.apache.spark.sql.functions: Returns a new string column by converting the first letter of each word to uppercase.
initDaemon(Logger) - Static method in class org.apache.spark.util.Utils: Utility function that should be called early in main() for daemons to set up some common diagnostic state.
initHadoopOutputMetrics(TaskContext) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
initialHash() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in interface org.apache.spark.broadcast.BroadcastFactory
initialize(double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
initialize(double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
initialize(double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
initialize(double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
initialize(RDD<Tuple2<Object, Vector>>, LDA) - Method in interface org.apache.spark.mllib.clustering.LDAOptimizer: Initializer for the optimizer.
initialize() - Static method in class org.apache.spark.rdd.InputFileBlockHolder: Initializes thread local by explicitly getting the value.
initialize(TaskScheduler, SchedulerBackend) - Method in interface org.apache.spark.scheduler.ExternalClusterManager: Initialize task scheduler and backend scheduler.
initialize(MutableAggregationBuffer) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Initializes the given aggregation buffer.
Initialized() - Static method in class org.apache.spark.rdd.CheckpointState
initializeLogging(boolean, boolean) - Method in interface org.apache.spark.internal.Logging
initializeLogIfNecessary(boolean) - Method in interface org.apache.spark.internal.Logging
initializeLogIfNecessary(boolean, boolean) - Method in interface org.apache.spark.internal.Logging
initialState(RDD<Tuple2<KeyType, StateType>>) - Method in class org.apache.spark.streaming.StateSpec: Set the RDD containing the initial states that will be used by mapWithState
initialState(JavaPairRDD<KeyType, StateType>) - Method in class org.apache.spark.streaming.StateSpec: Set the RDD containing the initial states that will be used by mapWithState
initialValue() - Method in class org.apache.spark.partial.PartialResult
initialWeights() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams: The initial weights of the model.
initInputSerDe(Seq<Expression>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
initMode() - Method in interface org.apache.spark.ml.clustering.KMeansParams: Param for the initialization algorithm.
initMode() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams: Param for the initialization algorithm.
initModel(DenseVector<Object>, Random) - Method in interface org.apache.spark.ml.ann.Layer: Returns the instance of the layer with randomly generated weights.
initOutputFormat(JobContext) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
initOutputSerDe(Seq<Attribute>) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
initSteps() - Method in interface org.apache.spark.ml.clustering.KMeansParams: Param for the number of steps for the k-means|| initialization mode.
initWriter(TaskAttemptContext, int) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
injectCheckRule(Function1<SparkSession, Function1<LogicalPlan, BoxedUnit>>) - Method in class org.apache.spark.sql.SparkSessionExtensions: Inject a check analysis Rule builder into the SparkSession.
injectOptimizerRule(Function1<SparkSession, Rule<LogicalPlan>>) - Method in class org.apache.spark.sql.SparkSessionExtensions: Inject an optimizer Rule builder into the SparkSession.
injectParser(Function2<SparkSession, ParserInterface, ParserInterface>) - Method in class org.apache.spark.sql.SparkSessionExtensions: Inject a custom parser into the SparkSession.
injectPlannerStrategy(Function1<SparkSession, SparkStrategy>) - Method in class org.apache.spark.sql.SparkSessionExtensions: Inject a planner Strategy builder into the SparkSession.
injectPostHocResolutionRule(Function1<SparkSession, Rule<LogicalPlan>>) - Method in class org.apache.spark.sql.SparkSessionExtensions: Inject an analyzer Rule builder into the SparkSession.
injectResolutionRule(Function1<SparkSession, Rule<LogicalPlan>>) - Method in class org.apache.spark.sql.SparkSessionExtensions: Inject an analyzer resolution Rule builder into the SparkSession.
InnerClosureFinder - Class in org.apache.spark.util
InnerClosureFinder(Set<Class<?>>) - Constructor for class org.apache.spark.util.InnerClosureFinder
innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.EdgeRDD: Inner joins this EdgeRDD with another EdgeRDD, assuming both are partitioned using the same PartitionStrategy.
innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD: Inner joins this VertexRDD with an RDD containing vertex attribute pairs.
innerZipJoin(VertexRDD, Function3<Object, VD, U, VD2>, ClassTag, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
innerZipJoin(VertexRDD, Function3<Object, VD, U, VD2>, ClassTag, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD: Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index.
inPlace() - Method in interface org.apache.spark.ml.ann.Layer: If true, the memory is not allocated for the output of this layer.
InProcessLauncher - Class in org.apache.spark.launcher: In-process launcher for Spark applications.
InProcessLauncher() - Constructor for class org.apache.spark.launcher.InProcessLauncher
input() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
INPUT() - Static method in class org.apache.spark.ui.ToolTips
input$() - Constructor for class org.apache.spark.InternalAccumulator.input$
input_file_name() - Static method in class org.apache.spark.sql.functions: Creates a string column for the file name of the current Spark task.
INPUT_FORMAT() - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
INPUT_METRICS_PREFIX() - Static method in class org.apache.spark.InternalAccumulator
INPUT_RECORDS() - Static method in class org.apache.spark.status.TaskIndexNames
INPUT_SIZE() - Static method in class org.apache.spark.status.TaskIndexNames
inputBytes() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
inputBytes() - Method in class org.apache.spark.status.api.v1.StageData
inputCol() - Method in interface org.apache.spark.ml.param.shared.HasInputCol: Param for input column name.
inputCols() - Method in interface org.apache.spark.ml.param.shared.HasInputCols: Param for input column names.
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
InputDStream<T> - Class in org.apache.spark.streaming.dstream: This is the abstract base class for all input streams.
InputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.InputDStream
InputFileBlockHolder - Class in org.apache.spark.rdd: This holds file names of the current Spark task.
InputFileBlockHolder() - Constructor for class org.apache.spark.rdd.InputFileBlockHolder
inputFiles() - Method in class org.apache.spark.sql.Dataset: Returns a best-effort snapshot of the files that compose this Dataset.
inputFormat() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
inputFormatClazz() - Method in class org.apache.spark.scheduler.InputFormatInfo
inputFormatClazz() - Method in class org.apache.spark.scheduler.SplitInfo
InputFormatInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Parses and holds information about inputFormat (and files) specified as a parameter.
InputFormatInfo(Configuration, Class<?>, String) - Constructor for class org.apache.spark.scheduler.InputFormatInfo
InputMetricDistributions - Class in org.apache.spark.status.api.v1
InputMetrics - Class in org.apache.spark.status.api.v1
inputMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
inputMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
InputPartition<T> - Interface in org.apache.spark.sql.sources.v2.reader: An input partition returned by DataSourceReader.planInputPartitions(), responsible for creating the actual data reader of one RDD partition.
InputPartitionReader<T> - Interface in org.apache.spark.sql.sources.v2.reader: An input partition reader returned by InputPartition.createPartitionReader(), responsible for outputting data for an RDD partition.
inputRecords() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
inputRecords() - Method in class org.apache.spark.status.api.v1.StageData
inputRowFormat() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
inputRowFormatMap() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
inputRowsPerSecond() - Method in class org.apache.spark.sql.streaming.SourceProgress
inputRowsPerSecond() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress: The aggregate (across all sources) rate of data arriving.
inputSchema() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: A StructType represents data types of input arguments of this aggregate function.
inputSerdeClass() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
inputSerdeProps() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
inputSize() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
inputStreamId() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
inputTypes() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
inRange(double, double, boolean, boolean) - Static method in class org.apache.spark.ml.param.ParamValidators: Check for value in range lowerBound to upperBound.
inRange(double, double) - Static method in class org.apache.spark.ml.param.ParamValidators: Version of `inRange()` which uses inclusive bounds by default: [lowerBound, upperBound]
insert(Dataset<Row>, boolean) - Method in interface org.apache.spark.sql.sources.InsertableRelation
InsertableRelation - Interface in org.apache.spark.sql.sources: A BaseRelation that can be used to insert data into it through the insert method.
insertInto(String) - Method in class org.apache.spark.sql.DataFrameWriter: Inserts the content of the DataFrame to the specified table.
InsertIntoHiveDirCommand - Class in org.apache.spark.sql.hive.execution: Command for writing the results of a query to the file system.
InsertIntoHiveDirCommand(boolean, CatalogStorageFormat, LogicalPlan, boolean, Seq<String>) - Constructor for class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
InsertIntoHiveTable - Class in org.apache.spark.sql.hive.execution: Command for writing data out to a Hive table.
InsertIntoHiveTable(CatalogTable, Map<String, Option<String>>, LogicalPlan, boolean, boolean, Seq<String>) - Constructor for class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
inShutdown() - Static method in class org.apache.spark.util.ShutdownHookManager: Detect whether this thread might be executing a shutdown hook.
inspectorToDataType(ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors
inspectorToDataType(ObjectInspector) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Entropy: Get this impurity instance.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Gini: Get this impurity instance.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Variance: Get this impurity instance.
INSTANCE - Static variable in class org.apache.spark.serializer.DummySerializerInstance
instantiate(String, String, String, boolean) - Static method in class org.apache.spark.internal.io.FileCommitProtocol: Instantiates a FileCommitProtocol using the given className.
instr(Column, String) - Static method in class org.apache.spark.sql.functions: Locate the position of the first occurrence of substr in the given string column.
INT() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable int type.
intAccumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use sc().longAccumulator(). Since 2.0.0.
intAccumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. Use sc().longAccumulator(String). Since 2.0.0.
IntAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.IntAccumulatorParam$: Deprecated.
IntArrayParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Array[Int]] for Java.
IntArrayParam(Params, String, String, Function1<int[], Object>) - Constructor for class org.apache.spark.ml.param.IntArrayParam
IntArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.IntArrayParam
IntegerType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the IntegerType object.
IntegerType - Class in org.apache.spark.sql.types: The data type representing Int values.
IntegerType() - Constructor for class org.apache.spark.sql.types.IntegerType
INTER_JOB_WAIT_MS() - Static method in class org.apache.spark.ui.UIWorkloadGenerator
InteractableTerm - Interface in org.apache.spark.ml.feature: A term that may be part of an interaction, e.g.
Interaction - Class in org.apache.spark.ml.feature: Implements the feature interaction transform.
Interaction(String) - Constructor for class org.apache.spark.ml.feature.Interaction
Interaction() - Constructor for class org.apache.spark.ml.feature.Interaction
intercept() - Method in class org.apache.spark.ml.classification.LinearSVCModel
intercept() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: The model intercept for "binomial" logistic regression.
intercept() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
intercept() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
intercept() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
intercept() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
intercept() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
intercept() - Method in class org.apache.spark.mllib.classification.SVMModel
intercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
intercept() - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
intercept() - Method in class org.apache.spark.mllib.regression.LassoModel
intercept() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
intercept() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
interceptVector() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
intermediateStorageLevel() - Method in interface org.apache.spark.ml.recommendation.ALSParams: Param for StorageLevel for intermediate datasets.
InternalAccumulator - Class in org.apache.spark: A collection of fields and methods concerned with internal accumulators that represent task level metrics.
InternalAccumulator() - Constructor for class org.apache.spark.InternalAccumulator
InternalAccumulator.input$ - Class in org.apache.spark
InternalAccumulator.output$ - Class in org.apache.spark
InternalAccumulator.shuffleRead$ - Class in org.apache.spark
InternalAccumulator.shuffleWrite$ - Class in org.apache.spark
InternalKMeansModelWriter - Class in org.apache.spark.ml.clustering: A writer for KMeans that handles the "internal" (or default) format
InternalKMeansModelWriter() - Constructor for class org.apache.spark.ml.clustering.InternalKMeansModelWriter
InternalLinearRegressionModelWriter - Class in org.apache.spark.ml.regression: A writer for LinearRegression that handles the "internal" (or default) format
InternalLinearRegressionModelWriter() - Constructor for class org.apache.spark.ml.regression.InternalLinearRegressionModelWriter
InternalNode - Class in org.apache.spark.ml.tree: Internal Decision Tree node.
InterruptibleIterator<T> - Class in org.apache.spark: :: DeveloperApi :: An iterator that wraps around an existing iterator to provide task killing functionality.
InterruptibleIterator(TaskContext, Iterator<T>) - Constructor for class org.apache.spark.InterruptibleIterator
interruptThread() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
interruptThread() - Method in class org.apache.spark.scheduler.local.KillTask
intersect(Dataset<T>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset containing rows only in both this Dataset and another Dataset.
intersectAll(Dataset<T>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset containing rows only in both this Dataset and another Dataset while preserving the duplicates.
intersection(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the intersection of this RDD and another one.
intersection(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the intersection of this RDD and another one.
intersection(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return the intersection of this RDD and another one.
intersection(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intersection(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intersection(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intervalMs() - Method in class org.apache.spark.sql.streaming.ProcessingTime: Deprecated.
IntParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Int] for Java.
IntParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.IntParam
IntParam(String, String, String) - Constructor for class org.apache.spark.ml.param.IntParam
IntParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.IntParam
IntParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.IntParam
IntParam - Class in org.apache.spark.util: An extractor object for parsing strings into integers.
IntParam() - Constructor for class org.apache.spark.util.IntParam
invalidateSerializedMapOutputStatusCache() - Method in class org.apache.spark.ShuffleStatus: Clears the cached serialized map output statuses.
inverse() - Method in class org.apache.spark.ml.feature.DCT: Indicates whether to perform the inverse DCT (true) or forward DCT (false).
inverse(double[], int) - Static method in class org.apache.spark.mllib.linalg.CholeskyDecomposition: Computes the inverse of a real symmetric positive definite matrix A using the Cholesky factorization A = U**T*U.
Inverse$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Inverse$
invokedMethod(Object, Class<?>, String) - Static method in class org.apache.spark.graphx.util.BytecodeUtils: Test whether the given closure invokes the specified method in the specified class.
invokeWriteReplace(Object) - Method in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
ioEncryptionKey() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig
ioschema() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
is32BitDecimalType(DataType) - Static method in class org.apache.spark.sql.types.DecimalType: Returns if dt is a DecimalType that fits inside an int
is64BitDecimalType(DataType) - Static method in class org.apache.spark.sql.types.DecimalType: Returns if dt is a DecimalType that fits inside a long
isActive() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Returns true if this query is actively running.
isActive() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
isActive() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
isActive() - Method in class org.apache.spark.status.LiveExecutor
isAddIntercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Get if the algorithm uses addIntercept
isAllowed(Enumeration.Value, Enumeration.Value) - Static method in class org.apache.spark.scheduler.TaskLocality
isBatchingEnabled(SparkConf, boolean) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
isBindCollision(Throwable) - Static method in class org.apache.spark.util.Utils: Return whether the exception is caused by an address-port collision when binding.
isBlacklisted() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
isBlacklisted() - Method in class org.apache.spark.status.LiveExecutor
isBlacklisted() - Method in class org.apache.spark.status.LiveExecutorStageSummary
isBlacklistedForStage() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
isBroadcast() - Method in class org.apache.spark.storage.BlockId
isBucket() - Method in class org.apache.spark.sql.catalog.Column
isByteArrayDecimalType(DataType) - Static method in class org.apache.spark.sql.types.DecimalType: Returns if dt is a DecimalType that doesn't fit inside a long
isCached(String) - Method in class org.apache.spark.sql.catalog.Catalog: Returns true if the table is currently cached in-memory.
isCached(String) - Method in class org.apache.spark.sql.SQLContext: Returns true if the table is currently cached in-memory.
isCached() - Method in class org.apache.spark.storage.BlockStatus
isCached() - Method in class org.apache.spark.storage.RDDInfo
isCancelled() - Method in class org.apache.spark.ComplexFutureAction
isCancelled() - Method in interface org.apache.spark.FutureAction: Returns whether the action has been cancelled.
isCancelled() - Method in class org.apache.spark.SimpleFutureAction
isCascadingTruncateTable() - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
isCascadingTruncateTable() - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Return Some[true] iff TRUNCATE TABLE cascades by default.
isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.NoopDialect
isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.OracleDialect
isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
isCascadingTruncateTable() - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
isCheckpointed() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return whether this RDD has been checkpointed or not
isCheckpointed() - Method in class org.apache.spark.graphx.Graph: Return whether this Graph has been checkpointed or not.
isCheckpointed() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
isCheckpointed() - Method in class org.apache.spark.graphx.impl.GraphImpl
isCheckpointed() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
isCheckpointed() - Method in class org.apache.spark.rdd.RDD: Return whether this RDD is checkpointed and materialized, either reliably or locally.
isCliSessionState() - Static method in class org.apache.spark.sql.hive.HiveUtils: Check current Thread's SessionState type
isColMajor() - Method in interface org.apache.spark.ml.linalg.Matrix: Indicates whether the values backing this matrix are arranged in column major order.
isCompatible(BloomFilter) - Method in class org.apache.spark.util.sketch.BloomFilter: Determines whether a given bloom filter is compatible with this bloom filter.
isCompleted() - Method in class org.apache.spark.BarrierTaskContext
isCompleted() - Method in class org.apache.spark.ComplexFutureAction
isCompleted() - Method in interface org.apache.spark.FutureAction: Returns whether the action has already been completed with a value or an exception.
isCompleted() - Method in class org.apache.spark.SimpleFutureAction
isCompleted() - Method in class org.apache.spark.TaskContext: Returns true if the task has completed.
isDataAvailable() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus
isDefined(Param<?>) - Method in interface org.apache.spark.ml.param.Params: Checks whether a param is explicitly set or has a default value.
isDistributed() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
isDistributed() - Method in class org.apache.spark.ml.clustering.LDAModel: Indicates whether this instance is of type DistributedLDAModel
isDistributed() - Method in class org.apache.spark.ml.clustering.LocalLDAModel
isDriver() - Method in class org.apache.spark.storage.BlockManagerId
isDynamicAllocationEnabled(SparkConf) - Static method in class org.apache.spark.util.Utils: Return whether dynamic allocation is enabled in the given conf.
isEmpty() - Method in interface org.apache.spark.api.java.JavaRDDLike
isEmpty() - Method in class org.apache.spark.rdd.RDD
isEmpty() - Method in class org.apache.spark.sql.Dataset: Returns true if the Dataset is empty.
isExecutorStartupConf(String) - Static method in class org.apache.spark.SparkConf: Return whether the given config should be passed to an executor on start-up.
isExperiment() - Method in class org.apache.spark.mllib.stat.test.BinarySample
isFailed(Enumeration.Value) - Static method in class org.apache.spark.TaskState
isFatalError(Throwable) - Static method in class org.apache.spark.util.Utils: Returns true if the given exception was fatal.
isFile(Path) - Static method in class org.apache.spark.ml.image.SamplePathFilter
isFinal() - Method in enum org.apache.spark.launcher.SparkAppHandle.State: Whether this state is a final state, meaning the application is not running anymore once it's reached.
isFinished(Enumeration.Value) - Static method in class org.apache.spark.TaskState
isIgnorableException(Throwable) - Method in interface org.apache.spark.util.ListenerBus: Allows bus implementations to prevent error logging for certain exceptions.
isin(Object...) - Method in class org.apache.spark.sql.Column: A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
isin(Seq<Object>) - Method in class org.apache.spark.sql.Column: A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
isInCollection(Iterable<?>) - Method in class org.apache.spark.sql.Column: A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection.
isInCollection(Iterable<?>) - Method in class org.apache.spark.sql.Column: A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection.
isInDirectory(File, File) - Static method in class org.apache.spark.util.Utils: Return whether the specified file is a parent directory of the child file.
isInitialValueFinal() - Method in class org.apache.spark.partial.PartialResult
isInterrupted() - Method in class org.apache.spark.BarrierTaskContext
isInterrupted() - Method in class org.apache.spark.TaskContext: Returns true if the task has been killed.
isLargerBetter() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
isLargerBetter() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
isLargerBetter() - Method in class org.apache.spark.ml.evaluation.Evaluator: Indicates whether the metric returned by evaluate should be maximized (true, default) or minimized (false).
isLargerBetter() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
isLargerBetter() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
isLeaf() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
isLeaf() - Method in class org.apache.spark.mllib.tree.model.Node
isLeftChild(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Returns true if this is a left child.
isLocal() - Method in class org.apache.spark.api.java.JavaSparkContext
isLocal() - Method in class org.apache.spark.SparkContext
isLocal() - Method in class org.apache.spark.sql.Dataset: Returns true if the collect and take methods can be run locally (without any Spark executors).
isLocal() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
isLocalMaster(SparkConf) - Static method in class org.apache.spark.util.Utils
isMac() - Static method in class org.apache.spark.util.Utils: Whether the underlying operating system is Mac OS X.
isModifiable(String) - Method in class org.apache.spark.sql.RuntimeConfig: Indicates whether the configuration property with the given key is modifiable in the current session.
isMulticlassClassification() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
isMulticlassWithCategoricalFeatures() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Duration
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Time
isNaN() - Method in class org.apache.spark.sql.Column: True if the current expression is NaN.
isnan(Column) - Static method in class org.apache.spark.sql.functions: Return true iff the column is NaN.
isNominal() - Method in class org.apache.spark.ml.attribute.Attribute: Tests whether this attribute is nominal, true for NominalAttribute and BinaryAttribute.
isNominal() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
isNominal() - Method in class org.apache.spark.ml.attribute.NominalAttribute
isNominal() - Method in class org.apache.spark.ml.attribute.NumericAttribute
isNominal() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
isNotNull() - Method in class org.apache.spark.sql.Column: True if the current expression is NOT null.
IsNotNull - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a non-null value.
IsNotNull(String) - Constructor for class org.apache.spark.sql.sources.IsNotNull
isNull() - Method in class org.apache.spark.sql.Column: True if the current expression is null.
isnull(Column) - Static method in class org.apache.spark.sql.functions: Return true iff the column is null.
IsNull - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to null.
IsNull(String) - Constructor for class org.apache.spark.sql.sources.IsNull
isNullAt(int) - Method in interface org.apache.spark.sql.Row: Checks whether the value at position i is null.
isNullAt(int) - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
isNullAt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
isNullAt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
isNullAt(int) - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns whether the value at rowId is NULL.
isNumeric() - Method in class org.apache.spark.ml.attribute.Attribute: Tests whether this attribute is numeric, true for NumericAttribute and BinaryAttribute.
isNumeric() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
isNumeric() - Method in class org.apache.spark.ml.attribute.NominalAttribute
isNumeric() - Method in class org.apache.spark.ml.attribute.NumericAttribute
isNumeric() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
isOpen() - Method in class org.apache.spark.security.CryptoStreamUtils.ErrorHandlingReadableChannel
isOpen() - Method in class org.apache.spark.storage.CountingWritableChannel
isOrdinal() - Method in class org.apache.spark.ml.attribute.NominalAttribute
isotonic() - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase: Param for whether the output sequence should be isotonic/increasing (true) or antitonic/decreasing (false).
isotonic() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
IsotonicRegression - Class in org.apache.spark.ml.regression: Isotonic regression.
IsotonicRegression(String) - Constructor for class org.apache.spark.ml.regression.IsotonicRegression
IsotonicRegression() - Constructor for class org.apache.spark.ml.regression.IsotonicRegression
IsotonicRegression - Class in org.apache.spark.mllib.regression: Isotonic regression.
IsotonicRegression() - Constructor for class org.apache.spark.mllib.regression.IsotonicRegression: Constructs IsotonicRegression instance with default parameter isotonic = true.
IsotonicRegressionBase - Interface in org.apache.spark.ml.regression: Params for isotonic regression.
IsotonicRegressionModel - Class in org.apache.spark.ml.regression: Model fitted by IsotonicRegression.
IsotonicRegressionModel - Class in org.apache.spark.mllib.regression: Regression model for isotonic regression.
IsotonicRegressionModel(double[], double[], boolean) - Constructor for class org.apache.spark.mllib.regression.IsotonicRegressionModel
IsotonicRegressionModel(Iterable<Object>, Iterable<Object>, Boolean) - Constructor for class org.apache.spark.mllib.regression.IsotonicRegressionModel: A Java-friendly constructor that takes two Iterable parameters and one Boolean parameter.
isOutputSpecValidationEnabled(SparkConf) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
isPartition() - Method in class org.apache.spark.sql.catalog.Column
isPresent() - Method in class org.apache.spark.api.java.Optional
isRDD() - Method in class org.apache.spark.storage.BlockId
isReady() - Method in interface org.apache.spark.scheduler.SchedulerBackend
isRegistered() - Method in class org.apache.spark.util.AccumulatorV2: Returns true if this accumulator has been registered.
isRInstalled() - Static method in class org.apache.spark.api.r.RUtils: Check if R is installed before running tests that use R commands.
isRowMajor() - Method in interface org.apache.spark.ml.linalg.Matrix: Indicates whether the values backing this matrix are arranged in row major order.
isRunningLocally() - Method in class org.apache.spark.BarrierTaskContext
isRunningLocally() - Method in class org.apache.spark.TaskContext: Deprecated. Local execution was removed, so this always returns false. Since 2.0.0.
isSet(Param<?>) - Method in interface org.apache.spark.ml.param.Params: Checks whether a param is explicitly set.
isShuffle() - Method in class org.apache.spark.storage.BlockId
isSparkPortConf(String) - Static method in class org.apache.spark.SparkConf: Return true if the given config matches either spark.*.port or spark.port.*.
isSparkRInstalled() - Static method in class org.apache.spark.api.r.RUtils: Check if SparkR is installed before running tests that use SparkR.
isSplitable(SparkSession, Map<String, String>, Path) - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
isStarted() - Method in class org.apache.spark.streaming.receiver.Receiver: Check if the receiver has started or not.
isStopped() - Method in class org.apache.spark.SparkContext
isStopped() - Method in class org.apache.spark.streaming.receiver.Receiver: Check if receiver has been marked for stopping.
isStreaming() - Method in class org.apache.spark.sql.Dataset: Returns true if this Dataset contains one or more sources that continuously return data as it arrives.
isSubClassOf(Type, Class<?>) - Method in interface org.apache.spark.sql.hive.HiveInspectors
isTemporary() - Method in class org.apache.spark.sql.catalog.Function
isTemporary() - Method in class org.apache.spark.sql.catalog.Table
isTesting() - Static method in class org.apache.spark.util.Utils: Indicates whether Spark is currently running unit tests.
isTimingOut() - Method in class org.apache.spark.streaming.State: Whether the state is timing out and going to be removed by the system after the current batch.
isTraceEnabled() - Method in interface org.apache.spark.internal.Logging
isTransposed() - Method in class org.apache.spark.ml.linalg.DenseMatrix
isTransposed() - Method in interface org.apache.spark.ml.linalg.Matrix: Flag that keeps track whether the matrix is transposed or not.
isTransposed() - Method in class org.apache.spark.ml.linalg.SparseMatrix
isTransposed() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
isTransposed() - Method in interface org.apache.spark.mllib.linalg.Matrix: Flag that keeps track whether the matrix is transposed or not.
isTransposed() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
isTriggerActive() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus
isValid() - Method in class org.apache.spark.ml.param.Param
isValid() - Method in class org.apache.spark.storage.StorageLevel
isWindows() - Static method in class org.apache.spark.util.Utils: Whether the underlying operating system is Windows.
isZero() - Method in class org.apache.spark.sql.types.Decimal
isZero() - Method in class org.apache.spark.streaming.Duration
isZero() - Method in class org.apache.spark.util.AccumulatorV2: Returns whether this accumulator is the zero value.
isZero() - Method in class org.apache.spark.util.CollectionAccumulator: Returns false if this accumulator instance has any values in it.
isZero() - Method in class org.apache.spark.util.DoubleAccumulator: Returns false if this accumulator has had any values added to it or the sum is non-zero.
isZero() - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
isZero() - Method in class org.apache.spark.util.LongAccumulator: Returns false if this accumulator has had any values added to it or the sum is non-zero.
item() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
itemCol() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams: Param for the column name for item ids.
itemFactors() - Method in class org.apache.spark.ml.recommendation.ALSModel
items() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
itemsCol() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams: Items column name.
itemSupport() - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
iterator(Partition, TaskContext) - Method in interface org.apache.spark.api.java.JavaRDDLike: Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD: Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator() - Method in class org.apache.spark.sql.types.StructType
iterator() - Method in class org.apache.spark.status.RDDPartitionSeq
IV_LENGTH_IN_BYTES() - Static method in class org.apache.spark.security.CryptoStreamUtils

J

j() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
jarOfClass(Class<?>) - Static method in class org.apache.spark.api.java.JavaSparkContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.SparkContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.StreamingContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfObject(Object) - Static method in class org.apache.spark.api.java.JavaSparkContext: Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jarOfObject(Object) - Static method in class org.apache.spark.SparkContext: Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jars() - Method in class org.apache.spark.api.java.JavaSparkContext
jars() - Method in class org.apache.spark.SparkContext
javaAntecedent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule: Returns antecedent in a Java List.
javaCategoryMaps() - Method in class org.apache.spark.ml.feature.VectorIndexerModel: Java-friendly version of categoryMaps
javaConsequent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule: Returns consequent in a Java List.
JavaDoubleRDD - Class in org.apache.spark.api.java
JavaDoubleRDD(RDD<Object>) - Constructor for class org.apache.spark.api.java.JavaDoubleRDD
JavaDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data.
JavaDStream(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaDStream
JavaDStreamLike<T,This extends JavaDStreamLike<T,This,R>,R extends JavaRDDLike<T,R>> - Interface in org.apache.spark.streaming.api.java
JavaFutureAction<T> - Interface in org.apache.spark.api.java
JavaHadoopRDD<K,V> - Class in org.apache.spark.api.java
JavaHadoopRDD(HadoopRDD<K, V>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaHadoopRDD
javaHome() - Method in class org.apache.spark.status.api.v1.RuntimeInfo
JavaInputDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to InputDStream.
JavaInputDStream(InputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaInputDStream
javaItems() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset: Returns items in a Java List.
JavaIterableWrapperSerializer - Class in org.apache.spark.serializer: A Kryo serializer for serializing results returned by asJavaIterable.
JavaIterableWrapperSerializer() - Constructor for class org.apache.spark.serializer.JavaIterableWrapperSerializer
JavaMapWithStateDStream<KeyType,ValueType,StateType,MappedType> - Class in org.apache.spark.streaming.api.java: :: Experimental :: DStream representing the stream of data generated by mapWithState operation on a JavaPairDStream.
JavaNewHadoopRDD<K,V> - Class in org.apache.spark.api.java
JavaNewHadoopRDD(NewHadoopRDD<K, V>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaNewHadoopRDD
javaOcvTypes() - Static method in class org.apache.spark.ml.image.ImageSchema: (Java-specific) Supported OpenCV type mapping.
JavaPackage - Class in org.apache.spark.mllib: A dummy class as a workaround to show the package doc of spark.mllib in generated Java API docs.
JavaPairDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to a DStream of key-value pairs, which provides extra methods like reduceByKey and join.
JavaPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairDStream
JavaPairInputDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to InputDStream of key-value pairs.
JavaPairInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairInputDStream
JavaPairRDD<K,V> - Class in org.apache.spark.api.java
JavaPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaPairRDD
JavaPairReceiverInputDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaPairReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
JavaParams - Class in org.apache.spark.ml.param: :: DeveloperApi :: Java-friendly wrapper for Params.
JavaParams() - Constructor for class org.apache.spark.ml.param.JavaParams
JavaRDD<T> - Class in org.apache.spark.api.java
JavaRDD(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.api.java.JavaRDD
javaRDD() - Method in class org.apache.spark.sql.Dataset: Returns the content of the Dataset as a JavaRDD of Ts.
JavaRDDLike<T,This extends JavaRDDLike<T,This>> - Interface in org.apache.spark.api.java: Defines operations common to several Java RDD implementations.
JavaReceiverInputDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
javaSequence() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence: Returns sequence as a Java List of lists for Java users.
javaSerialization(ClassTag<T>) - Static method in class org.apache.spark.sql.Encoders: (Scala-specific) Creates an encoder that serializes objects of type T using generic Java serialization.
javaSerialization(Class<T>) - Static method in class org.apache.spark.sql.Encoders: Creates an encoder that serializes objects of type T using generic Java serialization.
JavaSerializer - Class in org.apache.spark.serializer: :: DeveloperApi :: A Spark serializer that uses Java's built-in serialization.
JavaSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.JavaSerializer
JavaSparkContext - Class in org.apache.spark.api.java: A Java-friendly version of SparkContext that returns JavaRDDs and works with Java collections instead of Scala ones.
JavaSparkContext(SparkContext) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext() - Constructor for class org.apache.spark.api.java.JavaSparkContext: Create a JavaSparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
JavaSparkContext(SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String[]) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String[], Map<String, String>) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkStatusTracker - Class in org.apache.spark.api.java: Low-level status reporting APIs for monitoring job and stage progress.
JavaStreamingContext - Class in org.apache.spark.streaming.api.java: A Java-friendly version of StreamingContext which is the main entry point for Spark Streaming functionality.
JavaStreamingContext(StreamingContext) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
JavaStreamingContext(String, String, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[]) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[], Map<String, String>) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(JavaSparkContext, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a JavaStreamingContext using an existing JavaSparkContext.
JavaStreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a JavaStreamingContext using a SparkConf configuration.
JavaStreamingContext(String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Recreate a JavaStreamingContext from a checkpoint file.
JavaStreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Re-creates a JavaStreamingContext from a checkpoint file.
JavaStreamingListenerEvent - Interface in org.apache.spark.streaming.api.java: Base trait for events related to JavaStreamingListener
javaTopicAssignments() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Java-friendly version of topicAssignments
javaTopicDistributions() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Java-friendly version of topicDistributions
javaTopTopicsPerDocument(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Java-friendly version of topTopicsPerDocument
javaTreeWeights() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel: Weights used by the Python wrappers.
javaTypeToDataType(Type) - Method in interface org.apache.spark.sql.hive.HiveInspectors
javaTypeToDataType(Type) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
JavaUtils - Class in org.apache.spark.api.java
JavaUtils() - Constructor for class org.apache.spark.api.java.JavaUtils
JavaUtils.SerializableMapWrapper<A,B> - Class in org.apache.spark.api.java
javaVersion() - Method in class org.apache.spark.status.api.v1.RuntimeInfo
jdbc(String, String, Properties) - Method in class org.apache.spark.sql.DataFrameReader: Construct a DataFrame representing the database table named table, accessible via the JDBC URL url with the given connection properties.
jdbc(String, String, String, long, long, int, Properties) - Method in class org.apache.spark.sql.DataFrameReader: Construct a DataFrame representing the database table named table, accessible via the JDBC URL url.
jdbc(String, String, String[], Properties) - Method in class org.apache.spark.sql.DataFrameReader: Construct a DataFrame representing the database table named table, accessible via the JDBC URL url, using the given connection properties.
jdbc(String, String, Properties) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame to an external database table via JDBC.
jdbc(String, String) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().jdbc().
jdbc(String, String, String, long, long, int) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().jdbc().
jdbc(String, String, String[]) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().jdbc().
JdbcDialect - Class in org.apache.spark.sql.jdbc: :: DeveloperApi :: Encapsulates everything (extensions, workarounds, quirks) to handle the SQL dialect of a certain database or JDBC driver.
JdbcDialect() - Constructor for class org.apache.spark.sql.jdbc.JdbcDialect
JdbcDialects - Class in org.apache.spark.sql.jdbc: :: DeveloperApi :: Registry of dialects that apply to every new jdbc org.apache.spark.sql.DataFrame.
JdbcDialects() - Constructor for class org.apache.spark.sql.jdbc.JdbcDialects
jdbcNullType() - Method in class org.apache.spark.sql.jdbc.JdbcType
JdbcRDD<T> - Class in org.apache.spark.rdd: An RDD that executes a SQL query on a JDBC connection and reads results.
JdbcRDD(SparkContext, Function0<Connection>, String, long, long, int, Function1<ResultSet, T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.JdbcRDD
JdbcRDD.ConnectionFactory - Interface in org.apache.spark.rdd
JdbcType - Class in org.apache.spark.sql.jdbc: :: DeveloperApi :: A database type definition coupled with the JDBC type needed to send null values to the database.
JdbcType(String, int) - Constructor for class org.apache.spark.sql.jdbc.JdbcType
JettyUtils - Class in org.apache.spark.ui: Utilities for launching a web server using Jetty's HTTP Server class
JettyUtils() - Constructor for class org.apache.spark.ui.JettyUtils
JettyUtils.ServletParams<T> - Class in org.apache.spark.ui
JettyUtils.ServletParams$ - Class in org.apache.spark.ui
JOB_DAG() - Static method in class org.apache.spark.ui.ToolTips
JOB_TIMELINE() - Static method in class org.apache.spark.ui.ToolTips
JobData - Class in org.apache.spark.status.api.v1
jobEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
jobEndToJson(SparkListenerJobEnd) - Static method in class org.apache.spark.util.JsonProtocol
JobExecutionStatus - Enum in org.apache.spark
jobFailed(Exception) - Method in interface org.apache.spark.scheduler.JobListener
JobGeneratorEvent - Interface in org.apache.spark.streaming.scheduler: Event classes for JobGenerator
jobGroup() - Method in class org.apache.spark.status.api.v1.JobData
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
jobId() - Method in interface org.apache.spark.SparkJobInfo
jobId() - Method in class org.apache.spark.SparkJobInfoImpl
jobId() - Method in class org.apache.spark.status.api.v1.JobData
jobId() - Method in class org.apache.spark.status.LiveJob
jobID() - Method in class org.apache.spark.TaskCommitDenied
jobIds() - Method in interface org.apache.spark.api.java.JavaFutureAction: Returns the job IDs run by the underlying async operation.
jobIds() - Method in class org.apache.spark.ComplexFutureAction
jobIds() - Method in interface org.apache.spark.FutureAction: Returns the job IDs run by the underlying async operation.
jobIds() - Method in class org.apache.spark.SimpleFutureAction
jobIds() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
jobIds() - Method in class org.apache.spark.status.LiveStage
JobListener - Interface in org.apache.spark.scheduler: Interface used to listen for job completion or failure events after submitting a job to the DAGScheduler.
JobResult - Interface in org.apache.spark.scheduler: :: DeveloperApi :: A result of a job in the DAGScheduler.
jobResult() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
jobResultFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
jobResultToJson(JobResult) - Static method in class org.apache.spark.util.JsonProtocol
jobs() - Method in class org.apache.spark.status.LiveStage
JobSchedulerEvent - Interface in org.apache.spark.streaming.scheduler
jobStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
jobStartToJson(SparkListenerJobStart) - Static method in class org.apache.spark.util.JsonProtocol
JobSubmitter - Interface in org.apache.spark: Handle via which a "run" function passed to a ComplexFutureAction can submit jobs for execution.
JobSucceeded - Class in org.apache.spark.scheduler
JobSucceeded() - Constructor for class org.apache.spark.scheduler.JobSucceeded
join(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join(Dataset<?>) - Method in class org.apache.spark.sql.Dataset: Join with another DataFrame.
join(Dataset<?>, String) - Method in class org.apache.spark.sql.Dataset: Inner equi-join with another DataFrame using the given column.
join(Dataset<?>, Seq<String>) - Method in class org.apache.spark.sql.Dataset: Inner equi-join with another DataFrame using the given columns.
join(Dataset<?>, Seq<String>, String) - Method in class org.apache.spark.sql.Dataset: Equi-join with another DataFrame using the given columns.
join(Dataset<?>, Column) - Method in class org.apache.spark.sql.Dataset: Inner join with another DataFrame, using the given join expression.
join(Dataset<?>, Column, String) - Method in class org.apache.spark.sql.Dataset: Join with another DataFrame, using the given join expression.
join(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
joinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD>, ClassTag) - Method in class org.apache.spark.graphx.GraphOps: Join the vertices with an RDD and then apply a function from the vertex and RDD entry to a new vertex value.
joinWith(Dataset, Column, String) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: Joins this Dataset returning a Tuple2 for each pair where condition evaluates to true.
joinWith(Dataset, Column) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: Joins this Dataset using an inner equi-join, returning a Tuple2 for each pair where condition evaluates to true.
json(String...) - Method in class org.apache.spark.sql.DataFrameReader: Loads JSON files and returns the results as a DataFrame.
json(String) - Method in class org.apache.spark.sql.DataFrameReader: Loads a JSON file and returns the results as a DataFrame.
json(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads JSON files and returns the results as a DataFrame.
json(JavaRDD<String>) - Method in class org.apache.spark.sql.DataFrameReader: Deprecated. Use json(Dataset[String]) instead. Since 2.2.0.
json(RDD<String>) - Method in class org.apache.spark.sql.DataFrameReader: Deprecated. Use json(Dataset[String]) instead. Since 2.2.0.
json(Dataset<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads a Dataset[String] storing JSON objects (JSON Lines text format or newline-delimited JSON) and returns the result as a DataFrame.
json(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame in JSON format (JSON Lines text format or newline-delimited JSON) at the specified path.
json() - Method in class org.apache.spark.sql.sources.v2.reader.streaming.Offset: A JSON-serialized representation of an Offset that is used for saving offsets to the offset log.
json(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Loads a JSON file stream and returns the results as a DataFrame.
json() - Method in class org.apache.spark.sql.streaming.SinkProgress: The compact JSON representation of this progress.
json() - Method in class org.apache.spark.sql.streaming.SourceProgress: The compact JSON representation of this progress.
json() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress: The compact JSON representation of this progress.
json() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress: The compact JSON representation of this progress.
json() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus: The compact JSON representation of this status.
json() - Static method in class org.apache.spark.sql.types.BinaryType
json() - Static method in class org.apache.spark.sql.types.BooleanType
json() - Static method in class org.apache.spark.sql.types.ByteType
json() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
json() - Method in class org.apache.spark.sql.types.DataType: The compact JSON representation of this data type.
json() - Static method in class org.apache.spark.sql.types.DateType
json() - Static method in class org.apache.spark.sql.types.DoubleType
json() - Static method in class org.apache.spark.sql.types.FloatType
json() - Static method in class org.apache.spark.sql.types.IntegerType
json() - Static method in class org.apache.spark.sql.types.LongType
json() - Method in class org.apache.spark.sql.types.Metadata: Converts to its JSON representation.
json() - Static method in class org.apache.spark.sql.types.NullType
json() - Static method in class org.apache.spark.sql.types.ShortType
json() - Static method in class org.apache.spark.sql.types.StringType
json() - Static method in class org.apache.spark.sql.types.TimestampType
json_tuple(Column, String...) - Static method in class org.apache.spark.sql.functions: Creates a new row for a JSON column according to the given field names.
json_tuple(Column, Seq<String>) - Static method in class org.apache.spark.sql.functions: Creates a new row for a JSON column according to the given field names.
jsonDecode(String) - Method in class org.apache.spark.ml.param.BooleanParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.DoubleArrayArrayParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.DoubleArrayParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.DoubleParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.FloatParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.IntArrayParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.IntParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.LongParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.Param: Decodes a param value from JSON.
jsonDecode(String) - Method in class org.apache.spark.ml.param.StringArrayParam
jsonEncode(boolean) - Method in class org.apache.spark.ml.param.BooleanParam
jsonEncode(double[][]) - Method in class org.apache.spark.ml.param.DoubleArrayArrayParam
jsonEncode(double[]) - Method in class org.apache.spark.ml.param.DoubleArrayParam
jsonEncode(double) - Method in class org.apache.spark.ml.param.DoubleParam
jsonEncode(float) - Method in class org.apache.spark.ml.param.FloatParam
jsonEncode(int[]) - Method in class org.apache.spark.ml.param.IntArrayParam
jsonEncode(int) - Method in class org.apache.spark.ml.param.IntParam
jsonEncode(long) - Method in class org.apache.spark.ml.param.LongParam
jsonEncode(T) - Method in class org.apache.spark.ml.param.Param: Encodes a param value into JSON, which can be decoded by `jsonDecode()`.
jsonEncode(String[]) - Method in class org.apache.spark.ml.param.StringArrayParam
jsonFile(String) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().json().
jsonFile(String, StructType) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().json().
jsonFile(String, double) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().json().
JsonMatrixConverter - Class in org.apache.spark.ml.linalg
JsonMatrixConverter() - Constructor for class org.apache.spark.ml.linalg.JsonMatrixConverter
JsonProtocol - Class in org.apache.spark.util: Serializes SparkListener events to/from JSON.
JsonProtocol() - Constructor for class org.apache.spark.util.JsonProtocol
jsonRDD(RDD<String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().json().
jsonRDD(JavaRDD<String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().json().
jsonRDD(RDD<String>, StructType) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().json().
jsonRDD(JavaRDD<String>, StructType) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().json().
jsonRDD(RDD<String>, double) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().json().
jsonRDD(JavaRDD<String>, double) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().json().
jsonResponderToServlet(Function1<HttpServletRequest, JsonAST.JValue>) - Static method in class org.apache.spark.ui.JettyUtils
JsonVectorConverter - Class in org.apache.spark.ml.linalg
JsonVectorConverter() - Constructor for class org.apache.spark.ml.linalg.JsonVectorConverter
jValueDecode(JsonAST.JValue) - Static method in class org.apache.spark.ml.param.DoubleParam: Decodes a param value from JValue.
jValueDecode(JsonAST.JValue) - Static method in class org.apache.spark.ml.param.FloatParam: Decodes a param value from JValue.
jValueEncode(double) - Static method in class org.apache.spark.ml.param.DoubleParam: Encodes a param value into JValue.
jValueEncode(float) - Static method in class org.apache.spark.ml.param.FloatParam: Encodes a param value into JValue.
JVM_GC_TIME() - Static method in class org.apache.spark.InternalAccumulator
jvmGcTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
jvmGcTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics

K

k() - Method in interface org.apache.spark.ml.clustering.BisectingKMeansParams: The desired number of leaf clusters.
k() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
k() - Method in interface org.apache.spark.ml.clustering.GaussianMixtureParams: Number of independent Gaussians in the mixture model.
k() - Method in interface org.apache.spark.ml.clustering.KMeansParams: The number of clusters to create (k).
k() - Method in interface org.apache.spark.ml.clustering.LDAParams: Param for the number of topics (clusters) to infer.
k() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams: The number of clusters to create (k).
k() - Method in interface org.apache.spark.ml.feature.PCAParams: The number of principal components.
k() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Number of leaf clusters.
k() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
k() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
k() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Number of Gaussians in the mixture.
k() - Method in class org.apache.spark.mllib.clustering.KMeansModel: Total number of clusters.
k() - Method in class org.apache.spark.mllib.clustering.LDAModel: Number of topics
k() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
k() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
k() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
k() - Method in class org.apache.spark.mllib.feature.PCA
k() - Method in class org.apache.spark.mllib.feature.PCAModel
K_MEANS_PARALLEL() - Static method in class org.apache.spark.mllib.clustering.KMeans
kClassTag() - Method in class org.apache.spark.api.java.JavaHadoopRDD
kClassTag() - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
kClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
keepLastCheckpoint() - Method in interface org.apache.spark.ml.clustering.LDAParams: For EM optimizer only: optimizer = "em".
KernelDensity - Class in org.apache.spark.mllib.stat: Kernel density estimation.
KernelDensity() - Constructor for class org.apache.spark.mllib.stat.KernelDensity
keyArray() - Method in class org.apache.spark.sql.vectorized.ColumnarMap
keyAs(Encoder<L>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: Returns a new KeyValueGroupedDataset where the type of the key has been mapped to the specified type.
keyBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Creates tuples of the elements in this RDD by applying f.
keyBy(Function1<T, K>) - Method in class org.apache.spark.rdd.RDD: Creates tuples of the elements in this RDD by applying f.
keyOrdering() - Method in class org.apache.spark.ShuffleDependency
keyPrefix() - Method in interface org.apache.spark.sql.sources.v2.SessionConfigSupport: Key prefix of the session configs to propagate.
keys() - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the keys of each tuple.
keys() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the keys of each tuple.
keys() - Method in class org.apache.spark.sql.KeyValueGroupedDataset: Returns a Dataset that contains each unique key.
keyType() - Method in class org.apache.spark.sql.types.MapType
KeyValueGroupedDataset<K,V> - Class in org.apache.spark.sql: :: Experimental :: A Dataset that has been logically grouped by a user-specified grouping key.
kFold(RDD<T>, int, int, ClassTag<T>) - Static method in class org.apache.spark.mllib.util.MLUtils: Returns a k-element array of pairs of RDDs. In each pair, the first element is the training data (the complement of the validation data) and the second element is the validation data, a unique 1/k-th of the data.
kFold(RDD<T>, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.util.MLUtils: Version of kFold() taking a Long seed.
kill() - Method in interface org.apache.spark.launcher.SparkAppHandle: Tries to kill the underlying application.
killAllTaskAttempts(int, boolean, String) - Method in interface org.apache.spark.scheduler.TaskScheduler
killed() - Method in class org.apache.spark.scheduler.TaskInfo
KILLED() - Static method in class org.apache.spark.TaskState
killedSummary() - Method in class org.apache.spark.status.LiveJob
killedSummary() - Method in class org.apache.spark.status.LiveStage
killedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
killedTasks() - Method in class org.apache.spark.status.LiveExecutorStageSummary
killedTasks() - Method in class org.apache.spark.status.LiveJob
killedTasks() - Method in class org.apache.spark.status.LiveStage
killedTasksSummary() - Method in class org.apache.spark.status.api.v1.JobData
killedTasksSummary() - Method in class org.apache.spark.status.api.v1.StageData
killExecutor(String) - Method in interface org.apache.spark.ExecutorAllocationClient: Request that the cluster manager kill the specified executor.
killExecutor(String) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Request that the cluster manager kill the specified executor.
killExecutors(Seq<String>, boolean, boolean, boolean) - Method in interface org.apache.spark.ExecutorAllocationClient: Request that the cluster manager kill the specified executors.
KillExecutors(Seq<String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors
killExecutors(Seq<String>) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Request that the cluster manager kill the specified executors.
KillExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$
killExecutorsOnHost(String) - Method in interface org.apache.spark.ExecutorAllocationClient: Request that the cluster manager kill every executor on the specified host.
KillExecutorsOnHost(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutorsOnHost
KillExecutorsOnHost$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutorsOnHost$
KillTask(long, String, boolean, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
KillTask - Class in org.apache.spark.scheduler.local
KillTask(long, boolean, String) - Constructor for class org.apache.spark.scheduler.local.KillTask
killTask(long, String, boolean, String) - Method in interface org.apache.spark.scheduler.SchedulerBackend: Requests that an executor kills a running task.
KillTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$
killTaskAttempt(long, boolean, String) - Method in interface org.apache.spark.scheduler.TaskScheduler: Kills a task attempt.
killTaskAttempt(long, boolean, String) - Method in class org.apache.spark.SparkContext: Kill and reschedule the given task attempt.
KinesisDataGenerator - Interface in org.apache.spark.streaming.kinesis: A wrapper interface that will allow us to consolidate the code for synthetic data generation.
KinesisInitialPositions - Class in org.apache.spark.streaming.kinesis
KinesisInitialPositions() - Constructor for class org.apache.spark.streaming.kinesis.KinesisInitialPositions
KinesisInitialPositions.AtTimestamp - Class in org.apache.spark.streaming.kinesis
KinesisInitialPositions.Latest - Class in org.apache.spark.streaming.kinesis
KinesisInitialPositions.TrimHorizon - Class in org.apache.spark.streaming.kinesis
KinesisUtils - Class in org.apache.spark.streaming.kinesis
KinesisUtils() - Constructor for class org.apache.spark.streaming.kinesis.KinesisUtils
KinesisUtilsPythonHelper - Class in org.apache.spark.streaming.kinesis: This is a helper class that wraps the methods in KinesisUtils into a more Python-friendly class and functions so that they can be easily instantiated and called from Python's KinesisUtils.
KinesisUtilsPythonHelper() - Constructor for class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
kManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
KMeans - Class in org.apache.spark.ml.clustering: K-means clustering with support for k-means|| initialization proposed by Bahmani et al.
KMeans(String) - Constructor for class org.apache.spark.ml.clustering.KMeans
KMeans() - Constructor for class org.apache.spark.ml.clustering.KMeans
KMeans - Class in org.apache.spark.mllib.clustering: K-means clustering with a k-means++ like initialization mode (the k-means|| algorithm by Bahmani et al).
KMeans() - Constructor for class org.apache.spark.mllib.clustering.KMeans: Constructs a KMeans instance with default parameters: {k: 2, maxIterations: 20, initializationMode: "k-means||", initializationSteps: 2, epsilon: 1e-4, seed: random, distanceMeasure: "euclidean"}.
KMeansDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate test data for KMeans.
KMeansDataGenerator() - Constructor for class org.apache.spark.mllib.util.KMeansDataGenerator
KMeansModel - Class in org.apache.spark.ml.clustering: Model fitted by KMeans.
KMeansModel - Class in org.apache.spark.mllib.clustering: A clustering model for K-means.
KMeansModel(Vector[], String, double, int) - Constructor for class org.apache.spark.mllib.clustering.KMeansModel
KMeansModel(Vector[]) - Constructor for class org.apache.spark.mllib.clustering.KMeansModel
KMeansModel(Iterable<Vector>) - Constructor for class org.apache.spark.mllib.clustering.KMeansModel: A Java-friendly constructor that takes an Iterable of Vectors.
KMeansModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.clustering
KMeansModel.SaveLoadV2_0$ - Class in org.apache.spark.mllib.clustering
KMeansParams - Interface in org.apache.spark.ml.clustering: Common params for KMeans and KMeansModel
kMeansPlusPlus(int, VectorWithNorm[], double[], int, int) - Static method in class org.apache.spark.mllib.clustering.LocalKMeans: Run K-means++ on the weighted point set points.
KMeansSummary - Class in org.apache.spark.ml.clustering: :: Experimental :: Summary of KMeans.
KnownSizeEstimation - Interface in org.apache.spark.util: A trait that allows a class to give SizeEstimator more accurate size estimation.
KolmogorovSmirnovTest - Class in org.apache.spark.ml.stat: :: Experimental ::
KolmogorovSmirnovTest() - Constructor for class org.apache.spark.ml.stat.KolmogorovSmirnovTest
kolmogorovSmirnovTest(RDD<Object>, String, double...) - Static method in class org.apache.spark.mllib.stat.Statistics: Convenience function to conduct a one-sample, two-sided Kolmogorov-Smirnov test for probability distribution equality.
kolmogorovSmirnovTest(JavaDoubleRDD, String, double...) - Static method in class org.apache.spark.mllib.stat.Statistics: Java-friendly version of kolmogorovSmirnovTest()
kolmogorovSmirnovTest(RDD<Object>, Function1<Object, Object>) - Static method in class org.apache.spark.mllib.stat.Statistics: Conduct the two-sided Kolmogorov-Smirnov (KS) test for data sampled from a continuous distribution.
kolmogorovSmirnovTest(RDD<Object>, String, Seq<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics: Convenience function to conduct a one-sample, two-sided Kolmogorov-Smirnov test for probability distribution equality.
kolmogorovSmirnovTest(JavaDoubleRDD, String, Seq<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics: Java-friendly version of kolmogorovSmirnovTest()
KolmogorovSmirnovTest - Class in org.apache.spark.mllib.stat.test: Conduct the two-sided Kolmogorov-Smirnov (KS) test for data sampled from a continuous distribution.
KolmogorovSmirnovTest() - Constructor for class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest
KolmogorovSmirnovTest.NullHypothesis$ - Class in org.apache.spark.mllib.stat.test
KolmogorovSmirnovTestResult - Class in org.apache.spark.mllib.stat.test: Object containing the test results for the Kolmogorov-Smirnov test.
kryo(ClassTag<T>) - Static method in class org.apache.spark.sql.Encoders: (Scala-specific) Creates an encoder that serializes objects of type T using Kryo.
kryo(Class<T>) - Static method in class org.apache.spark.sql.Encoders: Creates an encoder that serializes objects of type T using Kryo.
KryoRegistrator - Interface in org.apache.spark.serializer: Interface implemented by clients to register their classes with Kryo when using Kryo serialization.
KryoSerializer - Class in org.apache.spark.serializer: A Spark serializer that uses the Kryo serialization library.
KryoSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.KryoSerializer
kurtosis(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the kurtosis of the values in a group.
kurtosis(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the kurtosis of the values in a group.
KVUtils - Class in org.apache.spark.status
KVUtils() - Constructor for class org.apache.spark.status.KVUtils

L

L1Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Updater for L1 regularized problems.
L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
label() - Method in class org.apache.spark.ml.feature.LabeledPoint
label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
labelCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Field in "predictions" which gives the true label of each instance (if available).
labelCol() - Method in class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
labelCol() - Method in interface org.apache.spark.ml.param.shared.HasLabelCol: Param for label column name.
labelCol() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
LabeledPoint - Class in org.apache.spark.ml.feature: Class that represents the features and label of a data point.
LabeledPoint(double, Vector) - Constructor for class org.apache.spark.ml.feature.LabeledPoint
LabeledPoint - Class in org.apache.spark.mllib.regression: Class that represents the features and labels of a data point.
LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
LabelPropagation - Class in org.apache.spark.graphx.lib: Label Propagation algorithm.
LabelPropagation() - Constructor for class org.apache.spark.graphx.lib.LabelPropagation
labels() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns the sequence of labels in ascending order.
labels() - Method in class org.apache.spark.ml.feature.IndexToString: Optional param for array of labels specifying index-string mapping.
labels() - Method in class org.apache.spark.ml.feature.StringIndexerModel
labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
labels() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns the sequence of labels in ascending order
labels() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns the sequence of labels in ascending order
lag(Column, int) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row.
lag(String, int) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row.
lag(String, int, Object) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.
lag(Column, int, Object) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.
LassoModel - Class in org.apache.spark.mllib.regression: Regression model trained using Lasso.
LassoModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LassoModel
LassoWithSGD - Class in org.apache.spark.mllib.regression: Train a regression model with L1-regularization using Stochastic Gradient Descent.
LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD: Deprecated. Use ml.regression.LinearRegression with elasticNetParam = 1.0. Note the default regParam is 0.01 for LassoWithSGD, but is 0.0 for LinearRegression. Since 2.0.0.
last(Column, boolean) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the last value in a group.
last(String, boolean) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the last value of the column in a group.
last(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the last value in a group.
last(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the last value of the column in a group.
last_day(Column) - Static method in class org.apache.spark.sql.functions: Returns the last day of the month which the given date belongs to.
lastDir() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
lastError() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
lastErrorMessage() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
lastErrorTime() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
lastErrorTime() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
lastProgress() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Returns the most recent StreamingQueryProgress update of this streaming query.
lastStageNameAndDescription(org.apache.spark.status.AppStatusStore, JobData) - Static method in class org.apache.spark.ui.jobs.ApiHelper
lastUpdate() - Method in class org.apache.spark.status.LiveRDDDistribution
lastUpdated() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
Latest() - Constructor for class org.apache.spark.streaming.kinesis.KinesisInitialPositions.Latest
latestModel() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Return the latest model.
latestModel() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Return the latest model.
launch() - Method in class org.apache.spark.launcher.SparkLauncher: Launches a sub-process that will start the configured Spark application.
LAUNCH_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
LAUNCHING() - Static method in class org.apache.spark.TaskState
LaunchTask(org.apache.spark.util.SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask
LaunchTask$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$
launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
launchTime() - Method in class org.apache.spark.status.api.v1.TaskData
Layer - Interface in org.apache.spark.ml.ann: Trait that holds the Layer properties needed to instantiate it.
LayerModel - Interface in org.apache.spark.ml.ann: Trait that holds Layer weights (or parameters).
layerModels() - Method in interface org.apache.spark.ml.ann.TopologyModel: Array of layer models
layers() - Method in interface org.apache.spark.ml.ann.TopologyModel: Array of layers
layers() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
layers() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams: Layer sizes including input size and output size.
LBFGS - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to solve an optimization problem using Limited-memory BFGS.
LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
LDA - Class in org.apache.spark.ml.clustering: Latent Dirichlet Allocation (LDA), a topic model designed for text documents.
LDA(String) - Constructor for class org.apache.spark.ml.clustering.LDA
LDA() - Constructor for class org.apache.spark.ml.clustering.LDA
LDA - Class in org.apache.spark.mllib.clustering: Latent Dirichlet Allocation (LDA), a topic model designed for text documents.
LDA() - Constructor for class org.apache.spark.mllib.clustering.LDA: Constructs a LDA instance with default parameters.
LDAModel - Class in org.apache.spark.ml.clustering: Model fitted by LDA.
LDAModel - Class in org.apache.spark.mllib.clustering: Latent Dirichlet Allocation (LDA) model.
LDAOptimizer - Interface in org.apache.spark.mllib.clustering: :: DeveloperApi ::
LDAParams - Interface in org.apache.spark.ml.clustering
LDAUtils - Class in org.apache.spark.mllib.clustering: Utility methods for LDA.
LDAUtils() - Constructor for class org.apache.spark.mllib.clustering.LDAUtils
lead(String, int) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row.
lead(Column, int) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row.
lead(String, int, Object) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.
lead(Column, int, Object) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.
LeafNode - Class in org.apache.spark.ml.tree: Decision tree leaf node.
learningDecay() - Method in interface org.apache.spark.ml.clustering.LDAParams: For Online optimizer only: optimizer = "online".
learningOffset() - Method in interface org.apache.spark.ml.clustering.LDAParams: For Online optimizer only: optimizer = "online".
learningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
least(Column...) - Static method in class org.apache.spark.sql.functions: Returns the least value of the list of values, skipping null values.
least(String, String...) - Static method in class org.apache.spark.sql.functions: Returns the least value of the list of column names, skipping null values.
least(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Returns the least value of the list of values, skipping null values.
least(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Returns the least value of the list of column names, skipping null values.
LeastSquaresGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a least-squares loss function, as used in linear regression.
LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
left() - Method in class org.apache.spark.sql.sources.And
left() - Method in class org.apache.spark.sql.sources.Or
leftCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit: Get sorted categories which split to the left
leftCategoriesOrThreshold() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData
leftChild() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
leftChild() - Method in class org.apache.spark.ml.tree.InternalNode
leftChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return the index of the left child of this node.
leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD: Left joins this VertexRDD with an RDD containing vertex attribute pairs.
leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
leftNodeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD: Left joins this RDD with another VertexRDD with the same index.
LegacyAccumulatorWrapper<R,T> - Class in org.apache.spark.util
LegacyAccumulatorWrapper(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.util.LegacyAccumulatorWrapper
length() - Method in class org.apache.spark.scheduler.SplitInfo
length(Column) - Static method in class org.apache.spark.sql.functions: Computes the character length of a given string or number of bytes of a binary string.
length() - Method in interface org.apache.spark.sql.Row: Number of elements in the Row.
length() - Method in class org.apache.spark.sql.types.CharType
length() - Method in class org.apache.spark.sql.types.HiveStringType
length() - Method in class org.apache.spark.sql.types.StructType
length() - Method in class org.apache.spark.sql.types.VarcharType
length() - Method in class org.apache.spark.status.RDDPartitionSeq
leq(Object) - Method in class org.apache.spark.sql.Column: Less than or equal to.
less(Duration) - Method in class org.apache.spark.streaming.Duration
less(Time) - Method in class org.apache.spark.streaming.Time
lessEq(Duration) - Method in class org.apache.spark.streaming.Duration
lessEq(Time) - Method in class org.apache.spark.streaming.Time
LessThan - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a value less than value.
LessThan(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThan
LessThanOrEqual - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a value less than or equal to value.
LessThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThanOrEqual
levenshtein(Column, Column) - Static method in class org.apache.spark.sql.functions: Computes the Levenshtein distance of the two given string columns.
libraryPathEnvName() - Static method in class org.apache.spark.util.Utils: Return the current system LD_LIBRARY_PATH name
libraryPathEnvPrefix(Seq<String>) - Static method in class org.apache.spark.util.Utils: Return the prefix of a command that appends the given library paths to the system-specific library path environment variable.
LibSVMDataSource - Class in org.apache.spark.ml.source.libsvm: The libsvm package implements the Spark SQL data source API for loading LIBSVM data as a DataFrame.
LibSVMDataSource() - Constructor for class org.apache.spark.ml.source.libsvm.LibSVMDataSource
lift() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule: Returns the lift of the rule.
like(String) - Method in class org.apache.spark.sql.Column: SQL like expression.
limit(int) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by taking the first n rows.
line() - Method in exception org.apache.spark.sql.AnalysisException
LinearDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate sample data used for Linear Data.
LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
LinearRegression - Class in org.apache.spark.ml.regression: Linear regression.
LinearRegression(String) - Constructor for class org.apache.spark.ml.regression.LinearRegression
LinearRegression() - Constructor for class org.apache.spark.ml.regression.LinearRegression
LinearRegressionModel - Class in org.apache.spark.ml.regression: Model produced by LinearRegression.
LinearRegressionModel - Class in org.apache.spark.mllib.regression: Regression model trained using LinearRegression.
LinearRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionModel
LinearRegressionParams - Interface in org.apache.spark.ml.regression: Params for linear regression.
LinearRegressionSummary - Class in org.apache.spark.ml.regression: :: Experimental :: Linear regression results evaluated on a dataset.
LinearRegressionTrainingSummary - Class in org.apache.spark.ml.regression: :: Experimental :: Linear regression training results.
LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train a linear regression model with no regularization using Stochastic Gradient Descent.
LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Deprecated. Use ml.regression.LinearRegression or LBFGS. Since 2.0.0.
LinearSVC - Class in org.apache.spark.ml.classification: :: Experimental ::
LinearSVC(String) - Constructor for class org.apache.spark.ml.classification.LinearSVC
LinearSVC() - Constructor for class org.apache.spark.ml.classification.LinearSVC
LinearSVCModel - Class in org.apache.spark.ml.classification: :: Experimental :: Linear SVM Model trained by LinearSVC
LinearSVCParams - Interface in org.apache.spark.ml.classification: Params for linear SVM Classifier.
link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.CLogLog$
link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Identity$
link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Inverse$
link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Log$
link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Logit$
link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Probit$
link(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Sqrt$
link() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase: Param for the name of link function which provides the relationship between the linear predictor and the mean of the distribution function.
Link$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Link$
linkPower() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase: Param for the index in the power link function.
linkPredictionCol() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase: Param for link prediction (linear predictor) column name.
listColumns(String) - Method in class org.apache.spark.sql.catalog.Catalog: Returns a list of columns for the given table/view or temporary view.
listColumns(String, String) - Method in class org.apache.spark.sql.catalog.Catalog: Returns a list of columns for the given table/view in the specified database.
listDatabases() - Method in class org.apache.spark.sql.catalog.Catalog: Returns a list of databases available across all sessions.
listDatabases(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: List the names of all the databases that match the specified pattern.
ListenerBus<L,E> - Interface in org.apache.spark.util: An event bus which posts events to its listeners.
listenerManager() - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: An interface to register custom QueryExecutionListeners that listen for execution metrics.
listenerManager() - Method in class org.apache.spark.sql.SQLContext: An interface to register custom QueryExecutionListeners that listen for execution metrics.
listeners() - Method in interface org.apache.spark.util.ListenerBus
listFiles() - Method in class org.apache.spark.SparkContext: Returns a list of file paths that are added to resources.
listFunctions() - Method in class org.apache.spark.sql.catalog.Catalog: Returns a list of functions registered in the current database.
listFunctions(String) - Method in class org.apache.spark.sql.catalog.Catalog: Returns a list of functions registered in the specified database.
listFunctions(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Return the names of all functions that match the given pattern in the database.
listingTable(Seq<String>, Function1<T, Seq<Node>>, Iterable<T>, boolean, Option<String>, Seq<String>, boolean, boolean) - Static method in class org.apache.spark.ui.UIUtils: Returns an HTML table constructed by generating a row for each object in a sequence.
listJars() - Method in class org.apache.spark.SparkContext: Returns a list of jar files that are added to resources.
listOrcFiles(String, Configuration) - Static method in class org.apache.spark.sql.hive.orc.OrcFileOperator
listTables() - Method in class org.apache.spark.sql.catalog.Catalog: Returns a list of tables/views in the current database.
listTables(String) - Method in class org.apache.spark.sql.catalog.Catalog: Returns a list of tables/views in the specified database.
listTables(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the names of all tables in the given database.
listTables(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the names of tables in the given database that match the given pattern.
lit(Object) - Static method in class org.apache.spark.sql.functions: Creates a Column of literal value.
literal(String) - Static method in class org.apache.spark.ml.feature.RFormulaParser
LIVE_ENTITY_UPDATE_MIN_FLUSH_PERIOD() - Static method in class org.apache.spark.status.config
LIVE_ENTITY_UPDATE_PERIOD() - Static method in class org.apache.spark.status.config
LiveEntityHelpers - Class in org.apache.spark.status
LiveEntityHelpers() - Constructor for class org.apache.spark.status.LiveEntityHelpers
LiveExecutor - Class in org.apache.spark.status
LiveExecutor(String, long) - Constructor for class org.apache.spark.status.LiveExecutor
LiveExecutorStageSummary - Class in org.apache.spark.status
LiveExecutorStageSummary(int, int, String) - Constructor for class org.apache.spark.status.LiveExecutorStageSummary
LiveJob - Class in org.apache.spark.status
LiveJob(int, String, Option<String>, Option<Date>, Seq<Object>, Option<String>, int) - Constructor for class org.apache.spark.status.LiveJob
LiveRDD - Class in org.apache.spark.status
LiveRDD(RDDInfo) - Constructor for class org.apache.spark.status.LiveRDD
LiveRDDDistribution - Class in org.apache.spark.status
LiveRDDDistribution(LiveExecutor) - Constructor for class org.apache.spark.status.LiveRDDDistribution
LiveRDDPartition - Class in org.apache.spark.status
LiveRDDPartition(String) - Constructor for class org.apache.spark.status.LiveRDDPartition
LiveStage - Class in org.apache.spark.status
LiveStage() - Constructor for class org.apache.spark.status.LiveStage
LiveTask - Class in org.apache.spark.status
LiveTask(TaskInfo, int, int, Option<Object>) - Constructor for class org.apache.spark.status.LiveTask
load(String) - Static method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
load(String) - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier
load(String) - Static method in class org.apache.spark.ml.classification.GBTClassificationModel
load(String) - Static method in class org.apache.spark.ml.classification.GBTClassifier
load(String) - Static method in class org.apache.spark.ml.classification.LinearSVC
load(String) - Static method in class org.apache.spark.ml.classification.LinearSVCModel
load(String) - Static method in class org.apache.spark.ml.classification.LogisticRegression
load(String) - Static method in class org.apache.spark.ml.classification.LogisticRegressionModel
load(String) - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
load(String) - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
load(String) - Static method in class org.apache.spark.ml.classification.NaiveBayes
load(String) - Static method in class org.apache.spark.ml.classification.NaiveBayesModel
load(String) - Static method in class org.apache.spark.ml.classification.OneVsRest
load(String) - Static method in class org.apache.spark.ml.classification.OneVsRestModel
load(String) - Static method in class org.apache.spark.ml.classification.RandomForestClassificationModel
load(String) - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
load(String) - Static method in class org.apache.spark.ml.clustering.BisectingKMeans
load(String) - Static method in class org.apache.spark.ml.clustering.BisectingKMeansModel
load(String) - Static method in class org.apache.spark.ml.clustering.DistributedLDAModel
load(String) - Static method in class org.apache.spark.ml.clustering.GaussianMixture
load(String) - Static method in class org.apache.spark.ml.clustering.GaussianMixtureModel
load(String) - Static method in class org.apache.spark.ml.clustering.KMeans
load(String) - Static method in class org.apache.spark.ml.clustering.KMeansModel
load(String) - Static method in class org.apache.spark.ml.clustering.LDA
load(String) - Static method in class org.apache.spark.ml.clustering.LocalLDAModel
load(String) - Static method in class org.apache.spark.ml.clustering.PowerIterationClustering
load(String) - Static method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
load(String) - Static method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
load(String) - Static method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
load(String) - Static method in class org.apache.spark.ml.evaluation.RegressionEvaluator
load(String) - Static method in class org.apache.spark.ml.feature.Binarizer
load(String) - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
load(String) - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
load(String) - Static method in class org.apache.spark.ml.feature.Bucketizer
load(String) - Static method in class org.apache.spark.ml.feature.ChiSqSelector
load(String) - Static method in class org.apache.spark.ml.feature.ChiSqSelectorModel
load(String) - Static method in class org.apache.spark.ml.feature.ColumnPruner
load(String) - Static method in class org.apache.spark.ml.feature.CountVectorizer
load(String) - Static method in class org.apache.spark.ml.feature.CountVectorizerModel
load(String) - Static method in class org.apache.spark.ml.feature.DCT
load(String) - Static method in class org.apache.spark.ml.feature.ElementwiseProduct
load(String) - Static method in class org.apache.spark.ml.feature.FeatureHasher
load(String) - Static method in class org.apache.spark.ml.feature.HashingTF
load(String) - Static method in class org.apache.spark.ml.feature.IDF
load(String) - Static method in class org.apache.spark.ml.feature.IDFModel
load(String) - Static method in class org.apache.spark.ml.feature.Imputer
load(String) - Static method in class org.apache.spark.ml.feature.ImputerModel
load(String) - Static method in class org.apache.spark.ml.feature.IndexToString
load(String) - Static method in class org.apache.spark.ml.feature.Interaction
load(String) - Static method in class org.apache.spark.ml.feature.MaxAbsScaler
load(String) - Static method in class org.apache.spark.ml.feature.MaxAbsScalerModel
load(String) - Static method in class org.apache.spark.ml.feature.MinHashLSH
load(String) - Static method in class org.apache.spark.ml.feature.MinHashLSHModel
load(String) - Static method in class org.apache.spark.ml.feature.MinMaxScaler
load(String) - Static method in class org.apache.spark.ml.feature.MinMaxScalerModel
load(String) - Static method in class org.apache.spark.ml.feature.NGram
load(String) - Static method in class org.apache.spark.ml.feature.Normalizer
load(String) - Static method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
load(String) - Static method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
load(String) - Static method in class org.apache.spark.ml.feature.OneHotEncoderModel
load(String) - Static method in class org.apache.spark.ml.feature.PCA
load(String) - Static method in class org.apache.spark.ml.feature.PCAModel
load(String) - Static method in class org.apache.spark.ml.feature.PolynomialExpansion
load(String) - Static method in class org.apache.spark.ml.feature.QuantileDiscretizer
load(String) - Static method in class org.apache.spark.ml.feature.RegexTokenizer
load(String) - Static method in class org.apache.spark.ml.feature.RFormula
load(String) - Static method in class org.apache.spark.ml.feature.RFormulaModel
load(String) - Static method in class org.apache.spark.ml.feature.SQLTransformer
load(String) - Static method in class org.apache.spark.ml.feature.StandardScaler
load(String) - Static method in class org.apache.spark.ml.feature.StandardScalerModel
load(String) - Static method in class org.apache.spark.ml.feature.StopWordsRemover
load(String) - Static method in class org.apache.spark.ml.feature.StringIndexer
load(String) - Static method in class org.apache.spark.ml.feature.StringIndexerModel
load(String) - Static method in class org.apache.spark.ml.feature.Tokenizer
load(String) - Static method in class org.apache.spark.ml.feature.VectorAssembler
load(String) - Static method in class org.apache.spark.ml.feature.VectorAttributeRewriter
load(String) - Static method in class org.apache.spark.ml.feature.VectorIndexer
load(String) - Static method in class org.apache.spark.ml.feature.VectorIndexerModel
load(String) - Static method in class org.apache.spark.ml.feature.VectorSizeHint
load(String) - Static method in class org.apache.spark.ml.feature.VectorSlicer
load(String) - Static method in class org.apache.spark.ml.feature.Word2Vec
load(String) - Static method in class org.apache.spark.ml.feature.Word2VecModel
load(String) - Static method in class org.apache.spark.ml.fpm.FPGrowth
load(String) - Static method in class org.apache.spark.ml.fpm.FPGrowthModel
load(String) - Static method in class org.apache.spark.ml.Pipeline
load(String, SparkContext, String) - Method in class org.apache.spark.ml.Pipeline.SharedReadWrite$: Load metadata and stages for a Pipeline or PipelineModel.
load(String) - Static method in class org.apache.spark.ml.PipelineModel
load(String) - Static method in class org.apache.spark.ml.r.RWrappers
load(String) - Static method in class org.apache.spark.ml.recommendation.ALS
load(String) - Static method in class org.apache.spark.ml.recommendation.ALSModel
load(String) - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegression
load(String) - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
load(String) - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
load(String) - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor
load(String) - Static method in class org.apache.spark.ml.regression.GBTRegressionModel
load(String) - Static method in class org.apache.spark.ml.regression.GBTRegressor
load(String) - Static method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
load(String) - Static method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
load(String) - Static method in class org.apache.spark.ml.regression.IsotonicRegression
load(String) - Static method in class org.apache.spark.ml.regression.IsotonicRegressionModel
load(String) - Static method in class org.apache.spark.ml.regression.LinearRegression
load(String) - Static method in class org.apache.spark.ml.regression.LinearRegressionModel
load(String) - Static method in class org.apache.spark.ml.regression.RandomForestRegressionModel
load(String) - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
load(String) - Static method in class org.apache.spark.ml.tuning.CrossValidator
load(String) - Static method in class org.apache.spark.ml.tuning.CrossValidatorModel
load(String) - Static method in class org.apache.spark.ml.tuning.TrainValidationSplit
load(String) - Static method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
load(String) - Method in interface org.apache.spark.ml.util.MLReadable: Reads an ML instance from the input path, a shortcut of read.load(path).
load(String) - Method in class org.apache.spark.ml.util.MLReader: Loads the ML component from the input path.
load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel
load(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$
load(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$
load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.SVMModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
load(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV1_0$
load(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV2_0$
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.DistributedLDAModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.KMeansModel
load(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV1_0$
load(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV2_0$
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.LocalLDAModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
load(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel.SaveLoadV1_0$
load(SparkContext, String) - Static method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
load(SparkContext, String) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$
load(SparkContext, String) - Static method in class org.apache.spark.mllib.feature.Word2VecModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.fpm.FPGrowthModel
load(SparkContext, String) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel.SaveLoadV1_0$
load(SparkContext, String) - Static method in class org.apache.spark.mllib.fpm.PrefixSpanModel
load(SparkContext, String) - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel.SaveLoadV1_0$
load(SparkContext, String) - Static method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Load a model from the given path.
load(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LassoModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LinearRegressionModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
load(SparkContext, String, String, int) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.RandomForestModel
load(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Loader: Load a model from the given path.
load(String...) - Method in class org.apache.spark.sql.DataFrameReader: Loads input in as a DataFrame, for data sources that support multiple paths.
load() - Method in class org.apache.spark.sql.DataFrameReader: Loads input in as a DataFrame, for data sources that don't require a path (e.g.
load(String) - Method in class org.apache.spark.sql.DataFrameReader: Loads input in as a DataFrame, for data sources that require a path (e.g.
load(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads input in as a DataFrame, for data sources that support multiple paths.
load(String) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().load(path).
load(String, String) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().format(source).load(path).
load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().format(source).options(options).load().
load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().format(source).options(options).load().
load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
load() - Method in class org.apache.spark.sql.streaming.DataStreamReader: Loads input data stream in as a DataFrame, for data streams that don't require a path (e.g.
load(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Loads input in as a DataFrame, for data streams that read from some path.
loadClass(String, boolean) - Method in class org.apache.spark.util.ChildFirstURLClassLoader
loadClass(String, boolean) - Method in class org.apache.spark.util.ParentClassLoader
loadData(SparkContext, String, String) - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$: Helper method for loading GLM classification model data.
loadData(SparkContext, String, String, int) - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$: Helper method for loading GLM regression model data.
loadDefaultSparkProperties(SparkConf, String) - Static method in class org.apache.spark.util.Utils: Load default Spark properties from the given file.
loadDefaultStopWords(String) - Static method in class org.apache.spark.ml.feature.StopWordsRemover: Loads the default stop words for the given language.
loadDynamicPartitions(String, String, String, LinkedHashMap<String, String>, boolean, int) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Loads new dynamic partitions into an existing table.
Loader<M extends Saveable> - Interface in org.apache.spark.mllib.util: :: DeveloperApi ::
loadExtensions(Class<T>, Seq<String>, SparkConf) - Static method in class org.apache.spark.util.Utils: Create instances of extension classes.
loadImpl(String, SparkSession, String, String) - Static method in class org.apache.spark.ml.tree.EnsembleModelReadWrite: Helper method for loading a tree ensemble from disk.
loadImpl(Dataset<Row>, Item, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel.SaveLoadV1_0$
loadImpl(Dataset<Row>, Item, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel.SaveLoadV1_0$
loadLabeledPoints(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile.
loadLabeledPoints(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile with the default number of partitions.
loadLibSVMFile(SparkContext, String, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled data in the LIBSVM format into an RDD[LabeledPoint].
loadLibSVMFile(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of partitions.
loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of features determined automatically and the default number of partitions.
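The loadLibSVMFile entries above refer to the LIBSVM text format, in which each line reads `label index1:value1 index2:value2 ...` with one-based, ascending feature indices. The following is a minimal plain-Java sketch of parsing one such line, offered only as an illustration of the format; it is not Spark's implementation, and the names LibSvmLineSketch, ParsedLine, and parseLine are hypothetical.

```java
import java.util.Arrays;

final class LibSvmLineSketch {

    /** Hypothetical holder for one parsed line: a label plus sparse (index, value) pairs. */
    static final class ParsedLine {
        final double label;
        final int[] indices;    // zero-based feature indices
        final double[] values;  // feature values, aligned with indices

        ParsedLine(double label, int[] indices, double[] values) {
            this.label = label;
            this.indices = indices;
            this.values = values;
        }
    }

    /** Parses "label index1:value1 index2:value2 ..." into a ParsedLine. */
    static ParsedLine parseLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        double label = Double.parseDouble(tokens[0]);
        int n = tokens.length - 1;
        int[] indices = new int[n];
        double[] values = new double[n];
        int previous = -1;
        for (int i = 0; i < n; i++) {
            String[] pair = tokens[i + 1].split(":");
            // Convert the one-based index in the file to a zero-based index.
            int index = Integer.parseInt(pair[0]) - 1;
            if (index <= previous) {
                throw new IllegalArgumentException(
                    "indices must be one-based and ascending: " + line);
            }
            indices[i] = index;
            values[i] = Double.parseDouble(pair[1]);
            previous = index;
        }
        return new ParsedLine(label, indices, values);
    }

    public static void main(String[] args) {
        ParsedLine p = parseLine("1 1:0.5 3:2.0");
        System.out.println(p.label + " " + Arrays.toString(p.indices)
            + " " + Arrays.toString(p.values));
    }
}
```

The sketch rejects a zero or repeated index because the ascending check starts from -1; comment handling and malformed tokens are left out for brevity.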
loadPartition(String, String, String, LinkedHashMap<String, String>, boolean, boolean, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Loads a static partition into an existing table.
loadTable(String, String, boolean, boolean) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Loads data into an existing table.
loadTreeNodes(String, org.apache.spark.ml.util.DefaultParamsReader.Metadata, SparkSession) - Static method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite: Load a decision tree from a file.
loadVectors(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads vectors saved using RDD[Vector].saveAsTextFile.
loadVectors(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads vectors saved using RDD[Vector].saveAsTextFile with the default number of partitions.
LOCAL_BLOCKS_FETCHED() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
LOCAL_BYTES_READ() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
LOCAL_CLUSTER_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
LOCAL_N_FAILURES_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
LOCAL_N_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
localBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
localBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
localBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
localCanonicalHostName() - Static method in class org.apache.spark.util.Utils: Get the local machine's FQDN.
localCheckpoint() - Method in class org.apache.spark.rdd.RDD: Mark this RDD for local checkpointing using Spark's existing caching layer.
localCheckpoint() - Method in class org.apache.spark.sql.Dataset: Eagerly locally checkpoints a Dataset and returns the new Dataset.
localCheckpoint(boolean) - Method in class org.apache.spark.sql.Dataset: Locally checkpoints a Dataset and returns the new Dataset.
locale() - Method in class org.apache.spark.ml.feature.StopWordsRemover: Locale of the input for case insensitive matching.
localHostName() - Static method in class org.apache.spark.util.Utils: Get the local machine's hostname.
localHostNameForURI() - Static method in class org.apache.spark.util.Utils: Get the local machine's URI.
LOCALITY() - Static method in class org.apache.spark.status.TaskIndexNames
localityAwareTasks() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
localitySummary() - Method in class org.apache.spark.status.LiveStage
LocalKMeans - Class in org.apache.spark.mllib.clustering: A utility object to run K-means locally.
LocalKMeans() - Constructor for class org.apache.spark.mllib.clustering.LocalKMeans
LocalLDAModel - Class in org.apache.spark.ml.clustering: Local (non-distributed) model fitted by LDA.
LocalLDAModel - Class in org.apache.spark.mllib.clustering: Local LDA model.
localSeqToDatasetHolder(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLImplicits: Creates a Dataset from a local Seq.
localSparkRPackagePath() - Static method in class org.apache.spark.api.r.RUtils: Get the SparkR package path in the local spark distribution.
localValue() - Method in class org.apache.spark.Accumulable: Deprecated.
Get the current value of this accumulator from within a task.
locate(String, Column) - Static method in class org.apache.spark.sql.functions: Locate the position of the first occurrence of substr.
locate(String, Column, int) - Static method in class org.apache.spark.sql.functions: Locate the position of the first occurrence of substr in a string column, after position pos.
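The two locate entries above report a one-based position and return 0 when substr is not found. The plain-Java sketch below mirrors that convention on ordinary strings. It is an illustration only, not the Spark function: the class name LocateSketch is hypothetical, null handling is omitted, and returning 0 for a pos below 1 is an assumption.

```java
final class LocateSketch {

    /** One-based position of the first occurrence of substr in str, or 0 if absent. */
    static int locate(String substr, String str) {
        // indexOf returns -1 when absent, so adding 1 yields 0.
        return str.indexOf(substr) + 1;
    }

    /** Same as above, but the search starts at one-based position pos. */
    static int locate(String substr, String str, int pos) {
        if (pos < 1) {
            return 0; // assumption: positions below 1 never match
        }
        return str.indexOf(substr, pos - 1) + 1;
    }

    public static void main(String[] args) {
        System.out.println(locate("bar", "foobarbar"));
        System.out.println(locate("bar", "foobarbar", 5));
        System.out.println(locate("x", "foobar"));
    }
}
```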
location() - Method in interface org.apache.spark.scheduler.MapStatus: Location where this task was run.
location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
location() - Method in class org.apache.spark.ui.storage.ExecutorStreamSummary
locations() - Method in class org.apache.spark.storage.BlockManagerMessages.BlockLocationsAndStatus
locationUri() - Method in class org.apache.spark.sql.catalog.Database
log() - Method in interface org.apache.spark.internal.Logging
log(Function0<Parsers.Parser<T>>, String) - Static method in class org.apache.spark.ml.feature.RFormulaParser
log(Column) - Static method in class org.apache.spark.sql.functions: Computes the natural logarithm of the given value.
log(String) - Static method in class org.apache.spark.sql.functions: Computes the natural logarithm of the given column.
log(double, Column) - Static method in class org.apache.spark.sql.functions: Returns the first argument-base logarithm of the second argument.
log(double, String) - Static method in class org.apache.spark.sql.functions: Returns the first argument-base logarithm of the second argument.
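The "first argument-base logarithm of the second argument" in the two entries above is the usual change-of-base identity, log_b(x) = ln(x) / ln(b). A minimal plain-Java sketch of that arithmetic follows. It is not Spark's implementation, and the class and method names (LogBaseSketch, logBase) are hypothetical.

```java
final class LogBaseSketch {

    /** Logarithm of x in the given base, computed via the change-of-base identity. */
    static double logBase(double base, double x) {
        return Math.log(x) / Math.log(base);
    }

    public static void main(String[] args) {
        // Floating-point rounding means results may differ from the exact
        // value in the last few digits.
        System.out.println(logBase(2.0, 8.0));
        System.out.println(logBase(10.0, 1000.0));
    }
}
```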
Log$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Log$
log10(Column) - Static method in class org.apache.spark.sql.functions: Computes the logarithm of the given value in base 10.
log10(String) - Static method in class org.apache.spark.sql.functions: Computes the logarithm of the given value in base 10.
log1p(Column) - Static method in class org.apache.spark.sql.functions: Computes the natural logarithm of the given value plus one.
log1p(String) - Static method in class org.apache.spark.sql.functions: Computes the natural logarithm of the given column plus one.
log2(Column) - Static method in class org.apache.spark.sql.functions: Computes the logarithm of the given column in base 2.
log2(String) - Static method in class org.apache.spark.sql.functions: Computes the logarithm of the given value in base 2.
log_() - Method in interface org.apache.spark.internal.Logging
logDebug(Function0<String>) - Method in interface org.apache.spark.internal.Logging
logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.internal.Logging
logDeprecationWarning(String) - Static method in class org.apache.spark.SparkConf: Logs a warning message if the given config key is deprecated.
logError(Function0<String>) - Method in interface org.apache.spark.internal.Logging
logError(Function0<String>, Throwable) - Method in interface org.apache.spark.internal.Logging
logEvent() - Method in interface org.apache.spark.scheduler.SparkListenerEvent
Logging - Interface in org.apache.spark.internal: Utility trait for classes that want to log data.
logInfo(Function0<String>) - Method in interface org.apache.spark.internal.Logging
logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.internal.Logging
LogisticGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression).
LogisticGradient(int) - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
LogisticRegression - Class in org.apache.spark.ml.classification: Logistic regression.
LogisticRegression(String) - Constructor for class org.apache.spark.ml.classification.LogisticRegression
LogisticRegression() - Constructor for class org.apache.spark.ml.classification.LogisticRegression
LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate test data for LogisticRegression.
LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
LogisticRegressionModel - Class in org.apache.spark.ml.classification: Model produced by LogisticRegression.
LogisticRegressionModel - Class in org.apache.spark.mllib.classification: Classification model trained using Multinomial/Binary Logistic Regression.
LogisticRegressionModel(Vector, double, int, int) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
LogisticRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel: Constructs a LogisticRegressionModel with weights and intercept for binary classification.
LogisticRegressionParams - Interface in org.apache.spark.ml.classification: Params for logistic regression.
LogisticRegressionSummary - Interface in org.apache.spark.ml.classification: :: Experimental :: Abstraction for logistic regression results for a given model.
LogisticRegressionSummaryImpl - Class in org.apache.spark.ml.classification: Multiclass logistic regression results for a given model.
LogisticRegressionSummaryImpl(Dataset<Row>, String, String, String, String) - Constructor for class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
LogisticRegressionTrainingSummary - Interface in org.apache.spark.ml.classification: :: Experimental :: Abstraction for multiclass logistic regression training results.
LogisticRegressionTrainingSummaryImpl - Class in org.apache.spark.ml.classification: Multiclass logistic regression training results.
LogisticRegressionTrainingSummaryImpl(Dataset<Row>, String, String, String, String, double[]) - Constructor for class org.apache.spark.ml.classification.LogisticRegressionTrainingSummaryImpl
LogisticRegressionWithLBFGS - Class in org.apache.spark.mllib.classification: Train a classification model for Multinomial/Binary Logistic Regression using Limited-memory BFGS.
LogisticRegressionWithLBFGS() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification: Train a classification model for Binary Logistic Regression using Stochastic Gradient Descent.
LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Deprecated.
Use ml.classification.LogisticRegression or LogisticRegressionWithLBFGS. Since 2.0.0.
Logit$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Logit$
logLikelihood() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
logLikelihood() - Method in class org.apache.spark.ml.clustering.GaussianMixtureSummary
logLikelihood(Dataset<?>) - Method in class org.apache.spark.ml.clustering.LDAModel: Calculates a lower bound on the log likelihood of the entire corpus.
logLikelihood() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Log likelihood of the observed tokens in the training set, given the current parameter estimates: log P(docs | topics, topic distributions for docs, alpha, eta)
logLikelihood() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
logLikelihood(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Calculates a lower bound on the log likelihood of the entire corpus.
logLikelihood(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Java-friendly version of logLikelihood
LogLoss - Class in org.apache.spark.mllib.tree.loss: :: DeveloperApi :: Class for log loss calculation (for classification).
LogLoss() - Constructor for class org.apache.spark.mllib.tree.loss.LogLoss
logName() - Method in interface org.apache.spark.internal.Logging
LogNormalGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the log normal distribution with the given mean and standard deviation.
LogNormalGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.LogNormalGenerator
logNormalGraph(SparkContext, int, int, double, double, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators: Generate a graph whose vertex out degree distribution is log normal.
logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.logNormalRDD.
logNormalJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.logNormalJavaRDD with the default seed.
logNormalJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.logNormalJavaRDD with the default number of partitions and the default seed.
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.logNormalVectorRDD.
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.logNormalJavaVectorRDD with the default seed.
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.logNormalJavaVectorRDD with the default number of partitions and the default seed.
logNormalRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input mean and standard deviation
logNormalVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a log normal distribution.
logpdf(Vector) - Method in class org.apache.spark.ml.stat.distribution.MultivariateGaussian: Returns the log-density of this multivariate Gaussian at given point, x
logpdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian: Returns the log-density of this multivariate Gaussian at given point, x
logPerplexity(Dataset<?>) - Method in class org.apache.spark.ml.clustering.LDAModel: Calculate an upper bound on perplexity.
logPerplexity(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Calculate an upper bound on perplexity.
logPerplexity(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Java-friendly version of logPerplexity
logPrior() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel: Log probability of the current parameter estimate: log P(topics, topic distributions for docs | Dirichlet hyperparameters)
logPrior() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Log probability of the current parameter estimate: log P(topics, topic distributions for docs | alpha, eta)
logStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
logStartToJson(SparkListenerLogStart) - Static method in class org.apache.spark.util.JsonProtocol
logTrace(Function0<String>) - Method in interface org.apache.spark.internal.Logging
logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.internal.Logging
logTuningParams(org.apache.spark.ml.util.Instrumentation) - Method in interface org.apache.spark.ml.tuning.ValidatorParams: Instrumentation logging for tuning params including the inner estimator and evaluator info.
logUncaughtExceptions(Function0<T>) - Static method in class org.apache.spark.util.Utils: Execute the given block, logging and re-throwing any uncaught exception.
logUrlMap() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
logUrls() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
logWarning(Function0<String>) - Method in interface org.apache.spark.internal.Logging
logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.internal.Logging
LONG() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable long type.
longAccumulator() - Method in class org.apache.spark.SparkContext: Create and register a long accumulator, which starts with 0 and accumulates inputs by add.
longAccumulator(String) - Method in class org.apache.spark.SparkContext: Create and register a long accumulator, which starts with 0 and accumulates inputs by add.
LongAccumulator - Class in org.apache.spark.util: An accumulator for computing sum, count, and average of 64-bit integers.
LongAccumulator() - Constructor for class org.apache.spark.util.LongAccumulator
LongAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.LongAccumulatorParam$: Deprecated.
LongParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Long] for Java.
LongParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.LongParam
LongParam(String, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
LongParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.LongParam
LongParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
LongType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the LongType object.
LongType - Class in org.apache.spark.sql.types: The data type representing Long values.
LongType() - Constructor for class org.apache.spark.sql.types.LongType
lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the list of values in the RDD for the given key.
lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return the list of values in the RDD for the given key.
lookupRpcTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils: Returns the default Spark timeout to use for RPC remote endpoint lookup.
loss(DenseMatrix<Object>, DenseMatrix<Object>, DenseMatrix<Object>) - Method in interface org.apache.spark.ml.ann.LossFunction: Returns the value of loss function.
loss() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator: The current loss value of this aggregator.
loss() - Method in interface org.apache.spark.ml.param.shared.HasLoss: Param for the loss function to be optimized.
loss() - Method in class org.apache.spark.ml.regression.AFTAggregator
loss() - Method in interface org.apache.spark.ml.regression.LinearRegressionParams: The loss function to be optimized.
loss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Loss - Interface in org.apache.spark.mllib.tree.loss: :: DeveloperApi :: Trait for adding "pluggable" loss functions for the gradient boosting algorithm.
Losses - Class in org.apache.spark.mllib.tree.loss
Losses() - Constructor for class org.apache.spark.mllib.tree.loss.Losses
LossFunction - Interface in org.apache.spark.ml.ann: Trait for loss function
LossReasonPending - Class in org.apache.spark.scheduler: A loss reason that means we don't yet know why the executor exited.
LossReasonPending() - Constructor for class org.apache.spark.scheduler.LossReasonPending
lossSum() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator
lossType() - Method in interface org.apache.spark.ml.tree.GBTClassifierParams: Loss function which GBT tries to minimize.
lossType() - Method in interface org.apache.spark.ml.tree.GBTRegressorParams: Loss function which GBT tries to minimize.
LOST() - Static method in class org.apache.spark.TaskState
low() - Method in class org.apache.spark.partial.BoundedDouble
lower(Column) - Static method in class org.apache.spark.sql.functions: Converts a string column to lower case.
lowerBoundsOnCoefficients() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams: The lower bounds on coefficients if fitting under bound constrained optimization.
lowerBoundsOnIntercepts() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams: The lower bounds on intercepts if fitting under bound constrained optimization.
LowPrioritySQLImplicits - Interface in org.apache.spark.sql: Lower priority implicit methods for converting Scala objects into Datasets.
lpad(Column, int, String) - Static method in class org.apache.spark.sql.functions: Left-pad the string column with pad to a length of len.
LSHParams - Interface in org.apache.spark.ml.feature: Params for LSH.
lt(double) - Static method in class org.apache.spark.ml.param.ParamValidators: Check if value is less than upperBound
lt(Object) - Method in class org.apache.spark.sql.Column: Less than.
ltEq(double) - Static method in class org.apache.spark.ml.param.ParamValidators: Check if value is less than or equal to upperBound
ltrim(Column) - Static method in class org.apache.spark.sql.functions: Trim the spaces from left end for the specified string value.
ltrim(Column, String) - Static method in class org.apache.spark.sql.functions: Trim the specified character string from left end for the specified string column.
LZ4CompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: LZ4 implementation of CompressionCodec.
LZ4CompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZ4CompressionCodec
LZFCompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: LZF implementation of CompressionCodec.
LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec

M

main(String[]) - Static method in class org.apache.spark.ml.param.shared.SharedParamsCodeGen
main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
main(String[]) - Static method in class org.apache.spark.streaming.util.RawTextSender
main(String[]) - Static method in class org.apache.spark.ui.UIWorkloadGenerator
main(String[]) - Method in interface org.apache.spark.util.CommandLineUtils
majorMinorVersion(String) - Static method in class org.apache.spark.util.VersionUtils: Given a Spark version string, return the (major version number, minor version number).
majorVersion(String) - Static method in class org.apache.spark.util.VersionUtils: Given a Spark version string, return the major version number.
makeBinarySearch(Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.util.CollectionsUtils
makeDescription(String, String, boolean) - Static method in class org.apache.spark.ui.UIUtils: Returns HTML rendering of a job or stage description.
makeDriverRef(String, SparkConf, org.apache.spark.rpc.RpcEnv) - Static method in class org.apache.spark.util.RpcUtils: Retrieve a RpcEndpointRef which is located in the driver via its name.
makeHref(boolean, String, String) - Static method in class org.apache.spark.ui.UIUtils: Return the correct Href after checking if master is running in the reverse proxy mode or not.
makeProgressBar(int, int, int, int, Map<String, Object>, int) - Static method in class org.apache.spark.ui.UIUtils
makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD.
makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD, with one or more location preferences (hostnames of Spark nodes) for each object.
makeRDDForPartitionedTable(Seq<Partition>) - Method in interface org.apache.spark.sql.hive.TableReader
makeRDDForTable(Table) - Method in interface org.apache.spark.sql.hive.TableReader
map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
map(Function1<Object, Object>) - Method in interface org.apache.spark.ml.linalg.Matrix: Map the values of this matrix using a function.
map(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix: Map the values of this matrix using a function.
map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult: Transform this PartialResult into a PartialResult of type T.
map(Function1<T, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to all elements of this RDD.
map(DataType, DataType) - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type map.
map(MapType) - Method in class org.apache.spark.sql.ColumnName
map(Function1<T, U>, Encoder) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Scala-specific) Returns a new Dataset that contains the result of applying func to each element.
map(MapFunction<T, U>, Encoder) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Java-specific) Returns a new Dataset that contains the result of applying func to each element.
map(Column...) - Static method in class org.apache.spark.sql.functions: Creates a new map column.
map(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Creates a new map column.
map(Function<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream.
map(Function1<T, U>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by applying a function to all elements of this DStream.
map_concat(Column...) - Static method in class org.apache.spark.sql.functions: Returns the union of all the given maps.
map_concat(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Returns the union of all the given maps.
map_from_arrays(Column, Column) - Static method in class org.apache.spark.sql.functions: Creates a new map column.
map_from_entries(Column) - Static method in class org.apache.spark.sql.functions: Returns a map created from the given array of entries.
map_keys(Column) - Static method in class org.apache.spark.sql.functions: Returns an unordered array containing the keys of the map.
map_values(Column) - Static method in class org.apache.spark.sql.functions: Returns an unordered array containing the values of the map.
mapAsSerializableJavaMap(Map<A, B>) - Static method in class org.apache.spark.api.java.JavaUtils
mapEdgePartitions(Function2<Object, EdgePartition<ED, VD>, EdgePartition<ED2, VD2>>, ClassTag<ED2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
mapEdges(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Transforms each edge attribute in the graph using the map function.
mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Transforms each edge attribute using the map function, passing it a whole partition at a time.
mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
mapFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol: Util JSON deserialization methods.
MapFunction<T,U> - Interface in org.apache.spark.api.java.function: Base interface for a map function used in Dataset's map function.
mapGroups(Function2<K, Iterator<V>, U>, Encoder) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: (Scala-specific) Applies the given function to each group of data.
mapGroups(MapGroupsFunction<K, V, U>, Encoder) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: (Java-specific) Applies the given function to each group of data.
MapGroupsFunction<K,V,R> - Interface in org.apache.spark.api.java.function: Base interface for a map function used in GroupedDataset's mapGroup function.
mapGroupsWithState(Function3<K, Iterator<V>, GroupState<S>, U>, Encoder<S>, Encoder) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: ::Experimental:: (Scala-specific) Applies the given function to each group of data, while maintaining a user-defined per-group state.
mapGroupsWithState(GroupStateTimeout, Function3<K, Iterator<V>, GroupState<S>, U>, Encoder<S>, Encoder) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: ::Experimental:: (Scala-specific) Applies the given function to each group of data, while maintaining a user-defined per-group state.
mapGroupsWithState(MapGroupsWithStateFunction<K, V, S, U>, Encoder<S>, Encoder) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: ::Experimental:: (Java-specific) Applies the given function to each group of data, while maintaining a user-defined per-group state.
mapGroupsWithState(MapGroupsWithStateFunction<K, V, S, U>, Encoder<S>, Encoder, GroupStateTimeout) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: ::Experimental:: (Java-specific) Applies the given function to each group of data, while maintaining a user-defined per-group state.
MapGroupsWithStateFunction<K,V,S,R> - Interface in org.apache.spark.api.java.function: ::Experimental:: Base interface for a map function used in KeyValueGroupedDataset.mapGroupsWithState(MapGroupsWithStateFunction, org.apache.spark.sql.Encoder, org.apache.spark.sql.Encoder)
mapId() - Method in class org.apache.spark.FetchFailed
mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
mapId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
mapId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
mapOutputTracker() - Method in class org.apache.spark.SparkEnv
MapOutputTrackerMessage - Interface in org.apache.spark
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(Function1<Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(Function1<Iterator<T>, Iterator<S>>, boolean, ClassTag<S>) - Method in class org.apache.spark.rdd.RDDBarrier: :: Experimental :: Returns a new RDD by applying a function to each partition of the wrapped RDD, where tasks are launched together in a barrier stage.
mapPartitions(Function1<Iterator<T>, Iterator>, Encoder) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Scala-specific) Returns a new Dataset that contains the result of applying func to each partition.
mapPartitions(MapPartitionsFunction<T, U>, Encoder) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Java-specific) Returns a new Dataset that contains the result of applying f to each partition.
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitions(Function1<Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
MapPartitionsFunction<T,U> - Interface in org.apache.spark.api.java.function: Base interface for function used in Dataset's mapPartitions.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaHadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaNewHadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.HadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.NewHadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
mapSideCombine() - Method in class org.apache.spark.ShuffleDependency
MapStatus - Interface in org.apache.spark.scheduler: Result returned by a ShuffleMapTask to a scheduler.
mapStatuses() - Method in class org.apache.spark.ShuffleStatus: MapStatus for each partition.
mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
mapToJson(Map<String, String>) - Static method in class org.apache.spark.util.JsonProtocol: Util JSON serialization methods.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream.
mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Transforms each edge attribute using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Transforms each edge attribute using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Transforms each edge attribute a partition at a time using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
MapType - Class in org.apache.spark.sql.types: The data type for Maps.
MapType(DataType, DataType, boolean) - Constructor for class org.apache.spark.sql.types.MapType
MapType() - Constructor for class org.apache.spark.sql.types.MapType: No-arg constructor for kryo.
mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.EdgeRDD: Map the values in an edge partitioning preserving the structure but changing the values.
mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD: Maps each vertex attribute, preserving the index.
mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD: Maps each vertex attribute, additionally supplying the vertex ID.
mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function1<V, W>, Encoder<W>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: Returns a new KeyValueGroupedDataset where the given function func has been applied to the data.
mapValues(MapFunction<V, W>, Encoder<W>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: Returns a new KeyValueGroupedDataset where the given function func has been applied to the data.
mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapValues(Function1<V, U>, ClassTag) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph: Transforms each vertex attribute in the graph using the map function.
mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
mapWithState(StateSpec<K, V, StateType, MappedType>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: :: Experimental :: Return a JavaMapWithStateDStream by applying a function to every key-value element of this stream, while maintaining some state data for each unique key.
mapWithState(StateSpec<K, V, StateType, MappedType>, ClassTag<StateType>, ClassTag<MappedType>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: :: Experimental :: Return a MapWithStateDStream by applying a function to every key-value element of this stream, while maintaining some state data for each unique key.
MapWithStateDStream<KeyType,ValueType,StateType,MappedType> - Class in org.apache.spark.streaming.dstream: :: Experimental :: DStream representing the stream of data generated by mapWithState operation on a pair DStream.
MapWithStateDStream(StreamingContext, ClassTag<MappedType>) - Constructor for class org.apache.spark.streaming.dstream.MapWithStateDStream
mark(int) - Method in class org.apache.spark.storage.BufferReleasingInputStream
markSupported() - Method in class org.apache.spark.storage.BufferReleasingInputStream
mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Restricts the graph to only the vertices and edges that are also in other, but keeps the attributes from this graph.
mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
master() - Method in class org.apache.spark.api.java.JavaSparkContext
master() - Method in class org.apache.spark.SparkContext
master(String) - Method in class org.apache.spark.sql.SparkSession.Builder: Sets the Spark master URL to connect to, such as "local" to run locally, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
Matrices - Class in org.apache.spark.ml.linalg: Factory methods for Matrix.
Matrices() - Constructor for class org.apache.spark.ml.linalg.Matrices
Matrices - Class in org.apache.spark.mllib.linalg: Factory methods for Matrix.
Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
Matrix - Interface in org.apache.spark.ml.linalg: Trait for a local matrix.
Matrix - Interface in org.apache.spark.mllib.linalg: Trait for a local matrix.
MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed: Represents an entry in a distributed matrix.
MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation: Model representing the result of matrix factorization.
MatrixFactorizationModel(int, RDD<Tuple2<Object, double[]>>, RDD<Tuple2<Object, double[]>>) - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
MatrixFactorizationModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.recommendation
MatrixImplicits - Class in org.apache.spark.mllib.linalg: Implicit methods available in Scala for converting org.apache.spark.mllib.linalg.Matrix to org.apache.spark.ml.linalg.Matrix and vice versa.
MatrixImplicits() - Constructor for class org.apache.spark.mllib.linalg.MatrixImplicits
MatrixType() - Static method in class org.apache.spark.ml.linalg.SQLDataTypes: Data type for Matrix.
max() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Returns the maximum element from this RDD as defined by the default comparator natural order.
max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the maximum element from this RDD as defined by the specified Comparator[T].
MAX() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
max() - Method in class org.apache.spark.ml.attribute.NumericAttribute
max() - Method in interface org.apache.spark.ml.feature.MinMaxScalerParams: Upper bound after transformation, shared by all features. Default: 1.0
max(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
max(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
max() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Maximum value of each dimension.
max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Maximum value of each column.
max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the max of this RDD as defined by the implicit Ordering[T].
max(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the maximum value of the expression in a group.
max(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the maximum value of the column in a group.
max(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Compute the max value for each numeric column for each group.
max(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Compute the max value for each numeric column for each group.
max(Duration) - Method in class org.apache.spark.streaming.Duration
max(Time) - Method in class org.apache.spark.streaming.Time
max(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
max() - Method in class org.apache.spark.util.StatCounter
MAX_FEATURES_FOR_NORMAL_SOLVER() - Static method in class org.apache.spark.ml.regression.LinearRegression: When using LinearRegression.solver == "normal", the solver must limit the number of features to at most this number.
MAX_INT_DIGITS() - Static method in class org.apache.spark.sql.types.Decimal: Maximum number of decimal digits an Int can represent
MAX_LONG_DIGITS() - Static method in class org.apache.spark.sql.types.Decimal: Maximum number of decimal digits a Long can represent
MAX_PRECISION() - Static method in class org.apache.spark.sql.types.DecimalType
MAX_RETAINED_DEAD_EXECUTORS() - Static method in class org.apache.spark.status.config
MAX_RETAINED_JOBS() - Static method in class org.apache.spark.status.config
MAX_RETAINED_ROOT_NODES() - Static method in class org.apache.spark.status.config
MAX_RETAINED_STAGES() - Static method in class org.apache.spark.status.config
MAX_RETAINED_TASKS_PER_STAGE() - Static method in class org.apache.spark.status.config
MAX_SCALE() - Static method in class org.apache.spark.sql.types.DecimalType
maxAbs() - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
MaxAbsScaler - Class in org.apache.spark.ml.feature: Rescale each feature individually to range [-1, 1] by dividing through the largest maximum absolute value in each feature.
MaxAbsScaler(String) - Constructor for class org.apache.spark.ml.feature.MaxAbsScaler
MaxAbsScaler() - Constructor for class org.apache.spark.ml.feature.MaxAbsScaler
MaxAbsScalerModel - Class in org.apache.spark.ml.feature: Model fitted by MaxAbsScaler.
MaxAbsScalerParams - Interface in org.apache.spark.ml.feature: Params for MaxAbsScaler and MaxAbsScalerModel.
maxBins() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Maximum number of bins used for discretizing continuous features and for choosing how to split on features at each node.
maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
maxBufferSizeMb() - Method in class org.apache.spark.serializer.KryoSerializer
maxCategories() - Method in interface org.apache.spark.ml.feature.VectorIndexerParams: Threshold for the number of values a categorical feature can take.
maxCores() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
maxDepth() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Maximum depth of the tree (nonnegative).
maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
maxDF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams: Specifies the maximum number of different documents a term could appear in to be included in the vocabulary.
maxId() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
maxId() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
maxId() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
maxId() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
maxId() - Static method in class org.apache.spark.rdd.CheckpointState
maxId() - Static method in class org.apache.spark.rdd.DeterministicLevel
maxId() - Static method in class org.apache.spark.scheduler.SchedulingMode
maxId() - Static method in class org.apache.spark.scheduler.TaskLocality
maxId() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
maxId() - Static method in class org.apache.spark.TaskState
maxIter() - Method in interface org.apache.spark.ml.param.shared.HasMaxIter: Param for maximum number of iterations (>= 0).
maxIters() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
maxLocalProjDBSize() - Method in class org.apache.spark.ml.fpm.PrefixSpan: Param for the maximum number of items (including delimiters used in the internal storage format) allowed in a projected database before local processing (default: 32000000).
maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
maxMemory() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
maxMemory() - Method in class org.apache.spark.status.LiveExecutor
maxMemoryInMB() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Maximum memory in MB allocated to histogram aggregation.
maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
maxMessageSizeBytes(SparkConf) - Static method in class org.apache.spark.util.RpcUtils: Returns the configured max message size for messages in bytes.
maxNodesInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return the maximum number of nodes which can be in the given level of the tree.
maxNumConcurrentTasks() - Method in interface org.apache.spark.scheduler.SchedulerBackend: Get the maximum number of tasks that can currently be launched concurrently.
maxOffHeapMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
maxOffHeapMemSize() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
maxOnHeapMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
maxOnHeapMemSize() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
maxPatternLength() - Method in class org.apache.spark.ml.fpm.PrefixSpan: Param for the maximal pattern length (default: 10).
maxPrecisionForBytes(int) - Static method in class org.apache.spark.sql.types.Decimal
maxReplicas() - Method in class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock
maxSentenceLength() - Method in interface org.apache.spark.ml.feature.Word2VecBase: Sets the maximum length (in words) of each sentence in the input data.
maxSplitFeatureIndex() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel: Trace down the tree, and return the largest feature index used in any split.
maxTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
maxTasks() - Method in class org.apache.spark.status.LiveExecutor
maxVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
maybeUpdateOutputMetrics(OutputMetrics, Function0<Object>, long) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
md5(Column) - Static method in class org.apache.spark.sql.functions: Calculates the MD5 digest of a binary column and returns the value as a 32 character hex string.
mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the mean of this RDD's elements.
mean() - Method in class org.apache.spark.ml.feature.StandardScalerModel
mean() - Method in class org.apache.spark.ml.stat.distribution.MultivariateGaussian
mean(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
mean(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
mean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
mean() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
mean() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
mean() - Method in class org.apache.spark.mllib.random.PoissonGenerator
mean() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Sample mean of each dimension.
mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample mean vector.
mean() - Method in class org.apache.spark.partial.BoundedDouble
mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the mean of this RDD's elements.
mean(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the average of the values in a group.
mean(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the average of the values in a group.
mean(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Compute the average value for each numeric column for each group.
mean(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Compute the average value for each numeric column for each group.
mean() - Method in class org.apache.spark.util.StatCounter
meanAbsoluteError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Returns the mean absolute error, which is a risk function corresponding to the expected value of the absolute error loss or l1-norm loss.
meanAbsoluteError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics: Returns the mean absolute error, which is a risk function corresponding to the expected value of the absolute error loss or l1-norm loss.
meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the approximate mean of the elements in this RDD.
meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Approximate operation to return the mean within a timeout.
meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Approximate operation to return the mean within a timeout.
meanAveragePrecision() - Method in class org.apache.spark.mllib.evaluation.RankingMetrics: Returns the mean average precision (MAP) of all the queries.
means() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
means() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
meanSquaredError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Returns the mean squared error, which is a risk function corresponding to the expected value of the squared error loss or quadratic loss.
meanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics: Returns the mean squared error, which is a risk function corresponding to the expected value of the squared error loss or quadratic loss.
megabytesToString(long) - Static method in class org.apache.spark.util.Utils: Convert a quantity in megabytes to a human-readable string such as "4.0 MB".
MEM_SPILL() - Static method in class org.apache.spark.status.TaskIndexNames
MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_BYTES_SPILLED() - Static method in class org.apache.spark.InternalAccumulator
MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.StageData
memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetrics
memoryCost(int, int) - Static method in class org.apache.spark.mllib.feature.PCAUtil
MemoryEntry<T> - Interface in org.apache.spark.storage.memory
MemoryEntryBuilder<T> - Interface in org.apache.spark.storage.memory
memoryManager() - Method in class org.apache.spark.SparkEnv
memoryMetrics() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
MemoryMetrics - Class in org.apache.spark.status.api.v1
memoryMode() - Method in class org.apache.spark.storage.memory.DeserializedMemoryEntry
memoryMode() - Method in interface org.apache.spark.storage.memory.MemoryEntry
memoryMode() - Method in class org.apache.spark.storage.memory.SerializedMemoryEntry
MemoryParam - Class in org.apache.spark.util: An extractor object for parsing JVM memory strings, such as "10g", into an Int representing the number of megabytes.
MemoryParam() - Constructor for class org.apache.spark.util.MemoryParam
memoryPerExecutorMB() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
memoryRemaining() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
memoryStringToMb(String) - Static method in class org.apache.spark.util.Utils: Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) to a number of mebibytes.
memoryUsed() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
memoryUsed() - Method in class org.apache.spark.status.LiveExecutor
memoryUsed() - Method in class org.apache.spark.status.LiveRDD
memoryUsed() - Method in class org.apache.spark.status.LiveRDDDistribution
memoryUsed() - Method in class org.apache.spark.status.LiveRDDPartition
memoryUsedBytes() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress
memSize() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
memSize() - Method in class org.apache.spark.storage.BlockStatus
memSize() - Method in class org.apache.spark.storage.BlockUpdatedInfo
memSize() - Method in class org.apache.spark.storage.RDDInfo
merge(R) - Method in class org.apache.spark.Accumulable: Deprecated. Merge two accumulable objects together.
merge(ExpectationAggregator) - Method in class org.apache.spark.ml.clustering.ExpectationAggregator: Merge another ExpectationAggregator, update the weights, means and covariances for each distribution, and update the log likelihood.
merge(Agg) - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator: Merge two aggregators.
merge(AFTAggregator) - Method in class org.apache.spark.ml.regression.AFTAggregator: Merge another AFTAggregator, and update the loss and gradient of the objective function.
merge(IDF.DocumentFrequencyAggregator) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator: Merges another.
merge(MultivariateOnlineSummarizer) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Merge another MultivariateOnlineSummarizer, and update the statistical summary.
merge(int, U) - Method in interface org.apache.spark.partial.ApproximateEvaluator
merge(BUF, BUF) - Method in class org.apache.spark.sql.expressions.Aggregator: Merge two intermediate values.
merge(MutableAggregationBuffer, Row) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Merges two aggregation buffers and stores the updated buffer values back to buffer1.
merge(AccumulatorV2<IN, OUT>) - Method in class org.apache.spark.util.AccumulatorV2: Merges another same-type accumulator into this one and updates its state, i.e.
merge(AccumulatorV2<T, List<T>>) - Method in class org.apache.spark.util.CollectionAccumulator
merge(AccumulatorV2<Double, Double>) - Method in class org.apache.spark.util.DoubleAccumulator
merge(AccumulatorV2<T, R>) - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
merge(AccumulatorV2<Long, Long>) - Method in class org.apache.spark.util.LongAccumulator
merge(double) - Method in class org.apache.spark.util.StatCounter: Add a value into this StatCounter, updating the internal statistics.
merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter: Add multiple values into this StatCounter, updating the internal statistics.
merge(StatCounter) - Method in class org.apache.spark.util.StatCounter: Merge another StatCounter into this one, adding up the internal statistics.
mergeCombiners() - Method in class org.apache.spark.Aggregator
mergeInPlace(BloomFilter) - Method in class org.apache.spark.util.sketch.BloomFilter: Combines this bloom filter with another bloom filter by performing a bitwise OR of the underlying data.
mergeInPlace(CountMinSketch) - Method in class org.apache.spark.util.sketch.CountMinSketch: Merges another CountMinSketch with this one in place.
mergeOffsets(PartitionOffset[]) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader: Merge partitioned offsets coming from ContinuousInputPartitionReader instances for each partition to a single global offset.
mergeValue() - Method in class org.apache.spark.Aggregator
message() - Method in class org.apache.spark.FetchFailed
message() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
message() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveWorker
message() - Static method in class org.apache.spark.scheduler.ExecutorKilled
message() - Static method in class org.apache.spark.scheduler.LossReasonPending
message() - Method in exception org.apache.spark.sql.AnalysisException
message() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
message() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus
MetaAlgorithmReadWrite - Class in org.apache.spark.ml.util: Default Meta-Algorithm read and write implementation.
MetaAlgorithmReadWrite() - Constructor for class org.apache.spark.ml.util.MetaAlgorithmReadWrite
Metadata - Class in org.apache.spark.sql.types: Metadata is a wrapper over Map[String, Any] that limits the value type to simple ones: Boolean, Long, Double, String, Metadata, Array[Boolean], Array[Long], Array[Double], Array[String], and Array[Metadata].
metadata() - Method in class org.apache.spark.sql.types.StructField
metadata() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
METADATA_KEY_DESCRIPTION() - Static method in class org.apache.spark.streaming.scheduler.StreamInputInfo: The key for description in StreamInputInfo.metadata.
MetadataBuilder - Class in org.apache.spark.sql.types: Builder for Metadata.
MetadataBuilder() - Constructor for class org.apache.spark.sql.types.MetadataBuilder
metadataDescription() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
MetadataUtils - Class in org.apache.spark.ml.util: Helper utilities for algorithms using ML metadata
MetadataUtils() - Constructor for class org.apache.spark.ml.util.MetadataUtils
Method(String, Function2<Object, Object, Object>) - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method
method() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
Method$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.Method$
MethodIdentifier<T> - Class in org.apache.spark.util: Helper class to identify a method.
MethodIdentifier(Class<T>, String, String) - Constructor for class org.apache.spark.util.MethodIdentifier
methodName() - Method in interface org.apache.spark.mllib.stat.test.StreamingTestMethod
methodName() - Static method in class org.apache.spark.mllib.stat.test.StudentTTest
methodName() - Static method in class org.apache.spark.mllib.stat.test.WelchTTest
METRIC_COMPILATION_TIME() - Static method in class org.apache.spark.metrics.source.CodegenMetrics: Histogram of the time it took to compile source code text (in milliseconds).
METRIC_FILE_CACHE_HITS() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics: Tracks the total number of files served from the file status cache instead of discovered.
METRIC_FILES_DISCOVERED() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics: Tracks the total number of files discovered off of the filesystem by InMemoryFileIndex.
METRIC_GENERATED_CLASS_BYTECODE_SIZE() - Static method in class org.apache.spark.metrics.source.CodegenMetrics: Histogram of the bytecode size of each class generated by CodeGenerator.
METRIC_GENERATED_METHOD_BYTECODE_SIZE() - Static method in class org.apache.spark.metrics.source.CodegenMetrics: Histogram of the bytecode size of each method in classes generated by CodeGenerator.
METRIC_HIVE_CLIENT_CALLS() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics: Tracks the total number of Hive client calls (e.g.
METRIC_PARALLEL_LISTING_JOB_COUNT() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics: Tracks the total number of Spark jobs launched for parallel file listing.
METRIC_PARTITIONS_FETCHED() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics: Tracks the total number of partition metadata entries fetched via the client api.
METRIC_SOURCE_CODE_SIZE() - Static method in class org.apache.spark.metrics.source.CodegenMetrics: Histogram of the length of source code text compiled by CodeGenerator (in characters).
metricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator: param for metric name in evaluation (supports "areaUnderROC" (default), "areaUnderPR")
metricName() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator: param for metric name in evaluation (supports "silhouette" (default))
metricName() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator: param for metric name in evaluation (supports "f1" (default), "weightedPrecision", "weightedRecall", "accuracy")
metricName() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator: Param for metric name in evaluation.
metricRegistry() - Static method in class org.apache.spark.metrics.source.CodegenMetrics
metricRegistry() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
metricRegistry() - Method in interface org.apache.spark.metrics.source.Source
metrics(String...) - Static method in class org.apache.spark.ml.stat.Summarizer: Given a list of metrics, provides a builder that in turn computes metrics from a column.
metrics(Seq<String>) - Static method in class org.apache.spark.ml.stat.Summarizer: Given a list of metrics, provides a builder that in turn computes metrics from a column.
metrics() - Method in class org.apache.spark.status.LiveExecutorStageSummary
metrics() - Method in class org.apache.spark.status.LiveStage
METRICS_PREFIX() - Static method in class org.apache.spark.InternalAccumulator
metricsSystem() - Method in class org.apache.spark.SparkEnv
MFDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate RDD(s) containing data for Matrix Factorization.
MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
MicroBatchReader - Interface in org.apache.spark.sql.sources.v2.reader.streaming: A mix-in interface for DataSourceReader.
MicroBatchReadSupport - Interface in org.apache.spark.sql.sources.v2: A mix-in interface for DataSourceV2.
microF1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns micro-averaged label-based f1-measure (equal to micro-averaged document-based f1-measure)
microPrecision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns micro-averaged label-based precision (equal to micro-averaged document-based precision)
microRecall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns micro-averaged label-based recall (equal to micro-averaged document-based recall)
mightContain(Object) - Method in class org.apache.spark.util.sketch.BloomFilter: Returns true if the element might have been put in this Bloom filter, false if this is definitely not the case.
mightContainBinary(byte[]) - Method in class org.apache.spark.util.sketch.BloomFilter: A specialized variant of BloomFilter.mightContain(Object) that only tests byte array items.
mightContainLong(long) - Method in class org.apache.spark.util.sketch.BloomFilter: A specialized variant of BloomFilter.mightContain(Object) that only tests long items.
mightContainString(String) - Method in class org.apache.spark.util.sketch.BloomFilter: A specialized variant of BloomFilter.mightContain(Object) that only tests String items.
milliseconds() - Method in class org.apache.spark.streaming.Duration
milliseconds(long) - Static method in class org.apache.spark.streaming.Durations
Milliseconds - Class in org.apache.spark.streaming: Helper object that creates an instance of Duration representing a given number of milliseconds.
Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
milliseconds() - Method in class org.apache.spark.streaming.Time
millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener: Reformat a time interval in milliseconds to a prettier format for output
min() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Returns the minimum element from this RDD as defined by the default comparator (natural order).
min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the minimum element from this RDD as defined by the specified Comparator[T].
MIN() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
min() - Method in class org.apache.spark.ml.attribute.NumericAttribute
min() - Method in interface org.apache.spark.ml.feature.MinMaxScalerParams: Lower bound after transformation, shared by all features. Default: 0.0
min(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
min(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
min() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Minimum value of each dimension.
min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Minimum value of each column.
min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the min of this RDD as defined by the implicit Ordering[T].
min(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the minimum value of the expression in a group.
min(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the minimum value of the column in a group.
min(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Compute the min value for each numeric column for each group.
min(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Compute the min value for each numeric column for each group.
min(Duration) - Method in class org.apache.spark.streaming.Duration
min(Time) - Method in class org.apache.spark.streaming.Time
min() - Method in class org.apache.spark.util.StatCounter
minBytesForPrecision() - Static method in class org.apache.spark.sql.types.Decimal
minConfidence() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams: Minimal confidence for generating Association Rule.
minCount() - Method in interface org.apache.spark.ml.feature.Word2VecBase: The minimum number of times a token must appear to be included in the word2vec model's vocabulary.
minDF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams: Specifies the minimum number of different documents a term must appear in to be included in the vocabulary.
minDivisibleClusterSize() - Method in interface org.apache.spark.ml.clustering.BisectingKMeansParams: The minimum number of points (if greater than or equal to 1.0) or the minimum proportion of points (if less than 1.0) of a divisible cluster (default: 1.0).
minDocFreq() - Method in interface org.apache.spark.ml.feature.IDFBase: The minimum number of documents in which a term should appear.
minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF
MinHashLSH - Class in org.apache.spark.ml.feature: :: Experimental ::
MinHashLSH(String) - Constructor for class org.apache.spark.ml.feature.MinHashLSH
MinHashLSH() - Constructor for class org.apache.spark.ml.feature.MinHashLSH
MinHashLSHModel - Class in org.apache.spark.ml.feature: :: Experimental ::
MINIMUM_ADJUSTED_SCALE() - Static method in class org.apache.spark.sql.types.DecimalType
minInfoGain() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Minimum information gain for a split to be considered at a tree node.
minInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
minInstancesPerNode() - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Minimum number of instances each child must have after split.
minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
MinMaxScaler - Class in org.apache.spark.ml.feature: Rescale each feature individually to a common range [min, max] linearly using column summary statistics, which is also known as min-max normalization or Rescaling.
MinMaxScaler(String) - Constructor for class org.apache.spark.ml.feature.MinMaxScaler
MinMaxScaler() - Constructor for class org.apache.spark.ml.feature.MinMaxScaler
MinMaxScalerModel - Class in org.apache.spark.ml.feature: Model fitted by MinMaxScaler.
MinMaxScalerParams - Interface in org.apache.spark.ml.feature: Params for MinMaxScaler and MinMaxScalerModel.
minorVersion(String) - Static method in class org.apache.spark.util.VersionUtils: Given a Spark version string, return the minor version number.
minSamplingRate() - Static method in class org.apache.spark.util.random.BinomialBounds
minShare() - Method in interface org.apache.spark.scheduler.Schedulable
minSupport() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams: Minimal support level of the frequent pattern.
minSupport() - Method in class org.apache.spark.ml.fpm.PrefixSpan: Param for the minimal support level (default: 0.1).
minTF() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams: Filter to ignore rare words in a document.
minTokenLength() - Method in class org.apache.spark.ml.feature.RegexTokenizer: Minimum token length, greater than or equal to 0.
minus(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
minus(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
minus(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.VertexRDD: For each VertexId present in both this and other, minus acts as a set difference operation, returning only those unique VertexIds present in this.
minus(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD: For each VertexId present in both this and other, minus acts as a set difference operation, returning only those unique VertexIds present in this.
minus(Object) - Method in class org.apache.spark.sql.Column: Subtraction.
minus(Decimal, Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
minus(Duration) - Method in class org.apache.spark.streaming.Duration
minus(Time) - Method in class org.apache.spark.streaming.Time
minus(Duration) - Method in class org.apache.spark.streaming.Time
minute(Column) - Static method in class org.apache.spark.sql.functions: Extracts the minutes as an integer from a given date/timestamp/string.
minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
minutes(long) - Static method in class org.apache.spark.streaming.Durations
Minutes - Class in org.apache.spark.streaming: Helper object that creates an instance of Duration representing a given number of minutes.
Minutes() - Constructor for class org.apache.spark.streaming.Minutes
minVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
missingValue() - Method in interface org.apache.spark.ml.feature.ImputerParams: The placeholder for the missing values.
mkList() - Static method in class org.apache.spark.ml.feature.RFormulaParser
mkString() - Method in interface org.apache.spark.sql.Row: Displays all elements of this sequence in a string (without a separator).
mkString(String) - Method in interface org.apache.spark.sql.Row: Displays all elements of this sequence in a string using a separator string.
mkString(String, String, String) - Method in interface org.apache.spark.sql.Row: Displays all elements of this traversable or iterator in a string using start, end, and separator strings.
mkString(String, String, String) - Method in class org.apache.spark.status.api.v1.StackTrace
ML_ATTR() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
mlDenseMatrixToMLlibDenseMatrix(DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
mlDenseVectorToMLlibDenseVector(DenseVector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
MLFormatRegister - Interface in org.apache.spark.ml.util: ML export formats should implement this trait so that users can specify a short name rather than the fully qualified class name of the exporter.
mllibDenseMatrixToMLDenseMatrix(DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
mllibDenseVectorToMLDenseVector(DenseVector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
mllibMatrixToMLMatrix(Matrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
mllibSparseMatrixToMLSparseMatrix(SparseMatrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
mllibSparseVectorToMLSparseVector(SparseVector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
mllibVectorToMLVector(Vector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
mlMatrixToMLlibMatrix(Matrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
MLPairRDDFunctions<K,V> - Class in org.apache.spark.mllib.rdd: :: DeveloperApi :: Machine learning specific Pair RDD functions.
MLPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.mllib.rdd.MLPairRDDFunctions
MLReadable<T> - Interface in org.apache.spark.ml.util: Trait for objects that provide MLReader.
MLReader<T> - Class in org.apache.spark.ml.util: Abstract class for utility classes that can load ML instances.
MLReader() - Constructor for class org.apache.spark.ml.util.MLReader
mlSparseMatrixToMLlibSparseMatrix(SparseMatrix) - Static method in class org.apache.spark.mllib.linalg.MatrixImplicits
mlSparseVectorToMLlibSparseVector(SparseVector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
MLUtils - Class in org.apache.spark.mllib.util: Helper methods to load, save and pre-process data used in MLlib.
MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
mlVectorToMLlibVector(Vector) - Static method in class org.apache.spark.mllib.linalg.VectorImplicits
MLWritable - Interface in org.apache.spark.ml.util: Trait for classes that provide MLWriter.
MLWriter - Class in org.apache.spark.ml.util: Abstract class for utility classes that can save ML instances in Spark's internal format.
MLWriter() - Constructor for class org.apache.spark.ml.util.MLWriter
MLWriterFormat - Interface in org.apache.spark.ml.util: Trait to be implemented by objects that provide ML exportability.
mod(Object) - Method in class org.apache.spark.sql.Column: Modulo (a.k.a. remainder) expression.
mode(SaveMode) - Method in class org.apache.spark.sql.DataFrameWriter: Specifies the behavior when data or table already exists.
mode(String) - Method in class org.apache.spark.sql.DataFrameWriter: Specifies the behavior when data or table already exists.
mode() - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
model(Vector) - Method in interface org.apache.spark.ml.ann.Topology
model(long) - Method in interface org.apache.spark.ml.ann.Topology
Model<M extends Model<M>> - Class in org.apache.spark.ml: :: DeveloperApi :: A fitted model, i.e., a Transformer produced by an Estimator.
Model() - Constructor for class org.apache.spark.ml.Model
models() - Method in class org.apache.spark.ml.classification.OneVsRestModel
modelType() - Method in interface org.apache.spark.ml.classification.NaiveBayesParams: The model type which is a string (case-sensitive).
modelType() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
modelType() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$: Deprecated. Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$: Deprecated. Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$: Deprecated. Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$: Deprecated. Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.StringAccumulatorParam$: Deprecated. Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.internal.io.FileCommitProtocol.EmptyTaskCommitMessage$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.InternalAccumulator.input$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.InternalAccumulator.output$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.InternalAccumulator.shuffleRead$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.InternalAccumulator.shuffleWrite$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.feature.Word2VecModel.Word2VecModelWriter$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.Pipeline.SharedReadWrite$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.InBlock$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.Rating$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.RatingBlock$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.CLogLog$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Family$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.FamilyAndLink$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Identity$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Inverse$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Link$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Log$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Logit$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Probit$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Sqrt$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Tweedie$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.tree.EnsembleModelReadWrite.EnsembleNodeData$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV2_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV2_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.fpm.FPGrowthModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.fpm.PrefixSpan.Postfix$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.fpm.PrefixSpan.Prefix$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.fpm.PrefixSpanModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.ChiSqTest.Method$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest.NullHypothesis$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.rdd.HadoopRDD.HadoopMapPartitionsWithSplitRDD$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.GetExecutorLossReason$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutors$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillExecutorsOnHost$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.LaunchTask$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveWorker$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveLastAllocatedExecutorId$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkAppConfig$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SetupDriver$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.Shutdown$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.hive.HiveShim.HiveFunctionWrapper$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.hive.HiveStrategies.HiveTableScans$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.hive.HiveStrategies.Scripts$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.RelationalGroupedDataset.CubeType$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.RelationalGroupedDataset.GroupByType$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.RelationalGroupedDataset.PivotType$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.RelationalGroupedDataset.RollupType$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.types.Decimal.DecimalAsIfIntegral$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.types.Decimal.DecimalIsFractional$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.types.DecimalType.Expression$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.types.DecimalType.Fixed$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.BlockLocationsAndStatus$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetExecutorEndpointRef$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocations$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocationsAndStatus$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetMemoryStatus$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetPeers$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.GetStorageStatus$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.HasCachedBlocks$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveBlock$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveRdd$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.StopBlockManagerMaster$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.TriggerThreadDump$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ui.JettyUtils.ServletParams$: Static reference to the singleton instance of this Scala object.
monotonically_increasing_id() - Static method in class org.apache.spark.sql.functions: A column expression that generates monotonically increasing 64-bit integers.
monotonicallyIncreasingId() - Static method in class org.apache.spark.sql.functions: Deprecated. Use monotonically_increasing_id(). Since 2.0.0.
month(Column) - Static method in class org.apache.spark.sql.functions: Extracts the month as an integer from a given date/timestamp/string.
months_between(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns number of months between dates end and start.
months_between(Column, Column, boolean) - Static method in class org.apache.spark.sql.functions: Returns number of months between dates end and start.
msDurationToString(long) - Static method in class org.apache.spark.util.Utils: Returns a human-readable string representing a duration such as "35ms".
MsSqlServerDialect - Class in org.apache.spark.sql.jdbc
MsSqlServerDialect() - Constructor for class org.apache.spark.sql.jdbc.MsSqlServerDialect
mu() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
MulticlassClassificationEvaluator - Class in org.apache.spark.ml.evaluation: :: Experimental :: Evaluator for multiclass classification, which expects two input columns: prediction and label.
MulticlassClassificationEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
MulticlassClassificationEvaluator() - Constructor for class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
multiclassMetrics() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary
MulticlassMetrics - Class in org.apache.spark.mllib.evaluation: Evaluator for multiclass classification.
MulticlassMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.MulticlassMetrics
MultilabelMetrics - Class in org.apache.spark.mllib.evaluation: Evaluator for multilabel classification.
MultilabelMetrics(RDD<Tuple2<double[], double[]>>) - Constructor for class org.apache.spark.mllib.evaluation.MultilabelMetrics
multiLabelValidator(int) - Static method in class org.apache.spark.mllib.util.DataValidators: Function to check if labels used for k class multi-label classification are in the range of {0, 1, ..., k - 1}.
MultilayerPerceptronClassificationModel - Class in org.apache.spark.ml.classification: Classification model based on the Multilayer Perceptron.
MultilayerPerceptronClassifier - Class in org.apache.spark.ml.classification: Classifier trainer based on the Multilayer Perceptron.
MultilayerPerceptronClassifier(String) - Constructor for class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
MultilayerPerceptronClassifier() - Constructor for class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
MultilayerPerceptronParams - Interface in org.apache.spark.ml.classification: Params for Multilayer Perceptron.
multiply(DenseMatrix) - Method in interface org.apache.spark.ml.linalg.Matrix: Convenience method for Matrix-DenseMatrix multiplication.
multiply(DenseVector) - Method in interface org.apache.spark.ml.linalg.Matrix: Convenience method for Matrix-DenseVector multiplication.
multiply(Vector) - Method in interface org.apache.spark.ml.linalg.Matrix: Convenience method for Matrix-Vector multiplication.
multiply(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Left multiplies this BlockMatrix to other, another BlockMatrix.
multiply(BlockMatrix, int) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Left multiplies this BlockMatrix to other, another BlockMatrix.
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Multiply this matrix by a local matrix on the right.
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Multiply this matrix by a local matrix on the right.
multiply(DenseMatrix) - Method in interface org.apache.spark.mllib.linalg.Matrix: Convenience method for Matrix-DenseMatrix multiplication.
multiply(DenseVector) - Method in interface org.apache.spark.mllib.linalg.Matrix: Convenience method for Matrix-DenseVector multiplication.
multiply(Vector) - Method in interface org.apache.spark.mllib.linalg.Matrix: Convenience method for Matrix-Vector multiplication.
multiply(Object) - Method in class org.apache.spark.sql.Column: Multiplication of this expression and another expression.
MultivariateGaussian - Class in org.apache.spark.ml.stat.distribution: This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution.
MultivariateGaussian(Vector, Matrix) - Constructor for class org.apache.spark.ml.stat.distribution.MultivariateGaussian
MultivariateGaussian - Class in org.apache.spark.mllib.stat.distribution: :: DeveloperApi :: This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution.
MultivariateGaussian(Vector, Matrix) - Constructor for class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
MultivariateOnlineSummarizer - Class in org.apache.spark.mllib.stat: :: DeveloperApi :: MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean, variance, minimum, maximum, counts, and nonzero counts for instances in sparse or dense vector format in an online fashion.
MultivariateOnlineSummarizer() - Constructor for class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat: Trait for multivariate statistical summary of a data matrix.
MutableAggregationBuffer - Class in org.apache.spark.sql.expressions: A Row representing a mutable aggregation buffer.
MutableAggregationBuffer() - Constructor for class org.apache.spark.sql.expressions.MutableAggregationBuffer
MutablePair<T1,T2> - Class in org.apache.spark.util: :: DeveloperApi :: A tuple of 2 elements.
MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
MutablePair() - Constructor for class org.apache.spark.util.MutablePair: No-arg constructor for serialization
MutableURLClassLoader - Class in org.apache.spark.util: URL class loader that exposes the `addURL` method in URLClassLoader.
MutableURLClassLoader(URL[], ClassLoader) - Constructor for class org.apache.spark.util.MutableURLClassLoader
myName() - Method in class org.apache.spark.util.InnerClosureFinder
MySQLDialect - Class in org.apache.spark.sql.jdbc
MySQLDialect() - Constructor for class org.apache.spark.sql.jdbc.MySQLDialect

N

n() - Method in class org.apache.spark.ml.feature.NGram: Minimum n-gram length, greater than or equal to 1.
n() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
na() - Method in class org.apache.spark.sql.Dataset: Returns a DataFrameNaFunctions for working with missing data.
NaiveBayes - Class in org.apache.spark.ml.classification: Naive Bayes Classifiers.
NaiveBayes(String) - Constructor for class org.apache.spark.ml.classification.NaiveBayes
NaiveBayes() - Constructor for class org.apache.spark.ml.classification.NaiveBayes
NaiveBayes - Class in org.apache.spark.mllib.classification: Trains a Naive Bayes model given an RDD of (label, features) pairs.
NaiveBayes(double) - Constructor for class org.apache.spark.mllib.classification.NaiveBayes
NaiveBayes() - Constructor for class org.apache.spark.mllib.classification.NaiveBayes
NaiveBayesModel - Class in org.apache.spark.ml.classification: Model produced by NaiveBayes. param: pi, the log of class priors, whose dimension is C (number of classes); param: theta, the log of class conditional probabilities, whose dimension is C (number of classes) by D (number of features).
NaiveBayesModel - Class in org.apache.spark.mllib.classification: Model for Naive Bayes Classifiers.
NaiveBayesModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.classification
NaiveBayesModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.classification: Model data for model import/export
NaiveBayesModel.SaveLoadV1_0$.Data$ - Class in org.apache.spark.mllib.classification
NaiveBayesModel.SaveLoadV2_0$ - Class in org.apache.spark.mllib.classification
NaiveBayesModel.SaveLoadV2_0$.Data - Class in org.apache.spark.mllib.classification: Model data for model import/export
NaiveBayesModel.SaveLoadV2_0$.Data$ - Class in org.apache.spark.mllib.classification
NaiveBayesParams - Interface in org.apache.spark.ml.classification: Params for Naive Bayes Classifiers.
name() - Method in class org.apache.spark.Accumulable: Deprecated.
name() - Method in interface org.apache.spark.api.java.JavaRDDLike
name() - Method in class org.apache.spark.ml.attribute.Attribute: Name of the attribute.
name() - Method in class org.apache.spark.ml.attribute.AttributeGroup
NAME() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
name() - Method in class org.apache.spark.ml.attribute.AttributeType
name() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
name() - Method in class org.apache.spark.ml.attribute.NominalAttribute
name() - Method in class org.apache.spark.ml.attribute.NumericAttribute
name() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
name() - Method in class org.apache.spark.ml.param.Param
name() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
name() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
name() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Identity$
name() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Inverse$
name() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Log$
name() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
name() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Sqrt$
name() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
name() - Method in class org.apache.spark.rdd.RDD: A friendly name for this RDD
name() - Method in class org.apache.spark.scheduler.AccumulableInfo
name() - Method in class org.apache.spark.scheduler.AsyncEventQueue
name() - Method in interface org.apache.spark.scheduler.Schedulable
name() - Method in class org.apache.spark.scheduler.StageInfo
name() - Method in interface org.apache.spark.SparkStageInfo
name() - Method in class org.apache.spark.SparkStageInfoImpl
name() - Method in class org.apache.spark.sql.catalog.Column
name() - Method in class org.apache.spark.sql.catalog.Database
name() - Method in class org.apache.spark.sql.catalog.Function
name() - Method in class org.apache.spark.sql.catalog.Table
name(String) - Method in class org.apache.spark.sql.Column: Gives the column a name (alias).
name() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Returns the user-specified name of the query, or null if not specified.
name() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryStartedEvent
name() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
name(String) - Method in class org.apache.spark.sql.TypedColumn: Gives the TypedColumn a name (alias).
name() - Method in class org.apache.spark.sql.types.StructField
name() - Method in class org.apache.spark.status.api.v1.AccumulableInfo
name() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
name() - Method in class org.apache.spark.status.api.v1.JobData
name() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
name() - Method in class org.apache.spark.status.api.v1.StageData
name() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
name() - Method in class org.apache.spark.storage.BlockId: A globally unique identifier for this Block.
name() - Method in class org.apache.spark.storage.BroadcastBlockId
name() - Method in class org.apache.spark.storage.RDDBlockId
name() - Method in class org.apache.spark.storage.RDDInfo
name() - Method in class org.apache.spark.storage.ShuffleBlockId
name() - Method in class org.apache.spark.storage.ShuffleDataBlockId
name() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
name() - Method in class org.apache.spark.storage.StreamBlockId
name() - Method in class org.apache.spark.storage.TaskResultBlockId
name() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
name() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
name() - Method in class org.apache.spark.util.AccumulatorV2: Returns the name of this accumulator, can only be called after registration.
name() - Method in class org.apache.spark.util.MethodIdentifier
namedThreadFactory(String) - Static method in class org.apache.spark.util.ThreadUtils: Create a thread factory that names threads with a prefix and also sets the threads to daemon.
names() - Method in class org.apache.spark.ml.feature.VectorSlicer: An array of feature names to select features from a vector column.
names() - Method in class org.apache.spark.sql.types.StructType: Returns all field names in an array.
nameToObjectMap() - Static method in class org.apache.spark.mllib.stat.correlation.CorrelationNames
nanSafeCompareDoubles(double, double) - Static method in class org.apache.spark.util.Utils: NaN-safe version of java.lang.Double.compare() which allows NaN values to be compared according to semantics where NaN == NaN and NaN is greater than any non-NaN double.
nanSafeCompareFloats(float, float) - Static method in class org.apache.spark.util.Utils: NaN-safe version of java.lang.Float.compare() which allows NaN values to be compared according to semantics where NaN == NaN and NaN is greater than any non-NaN float.
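
The ordering described for nanSafeCompareDoubles and nanSafeCompareFloats (NaN equal to NaN, NaN greater than any non-NaN value) can be sketched in plain Java. This is only an illustration of those stated semantics, not the Spark source; the class name NanSafeCompareSketch is hypothetical.

```java
// Illustrative sketch of the NaN-safe ordering described above.
// NanSafeCompareSketch is a hypothetical name, not part of Spark.
final class NanSafeCompareSketch {

    /** Returns 0 when both values are equal (NaN counts as equal to NaN),
     *  a positive value when x sorts after y, and a negative value otherwise. */
    static int nanSafeCompareDoubles(double x, double y) {
        boolean xIsNan = Double.isNaN(x);
        boolean yIsNan = Double.isNaN(y);
        if ((xIsNan && yIsNan) || x == y) {
            return 0;            // NaN == NaN under these semantics
        }
        if (xIsNan) {
            return 1;            // NaN is greater than any non-NaN value
        }
        if (yIsNan) {
            return -1;
        }
        return x > y ? 1 : -1;   // ordinary numeric ordering
    }

    public static void main(String[] args) {
        System.out.println(nanSafeCompareDoubles(Double.NaN, Double.NaN));               // 0
        System.out.println(nanSafeCompareDoubles(Double.NaN, Double.POSITIVE_INFINITY)); // 1
        System.out.println(nanSafeCompareDoubles(1.0, Double.NaN));                      // -1
    }
}
```

The float variant would follow the same shape with Float.isNaN.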
nanvl(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns col1 if it is not NaN, or col2 if col1 is NaN.
NarrowDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Base class for dependencies where each partition of the child RDD depends on a small number of partitions of the parent RDD.
NarrowDependency(RDD<T>) - Constructor for class org.apache.spark.NarrowDependency
ndcgAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics: Compute the average NDCG value of all the queries, truncated at ranking position k.
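
As a rough illustration of what "average NDCG truncated at position k" means, the following sketch computes NDCG with binary relevance for plain Java collections and averages it over queries. It assumes binary relevance (an item is either in the ground-truth set or not) and returns 0 for a query with no ground truth; the class and method names are hypothetical and the code does not use the RankingMetrics API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; NdcgSketch is a hypothetical name, not part of Spark.
final class NdcgSketch {

    /** NDCG at k for one query, assuming binary relevance. */
    static double ndcgAt(List<Integer> predicted, Set<Integer> relevant, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("ranking position k must be positive");
        }
        if (relevant.isEmpty()) {
            return 0.0; // assumption: a query without ground truth scores 0
        }
        int n = Math.min(Math.max(predicted.size(), relevant.size()), k);
        double dcg = 0.0;
        double idealDcg = 0.0;
        for (int i = 0; i < n; i++) {
            double gain = 1.0 / Math.log(i + 2); // log base cancels in the ratio
            if (i < predicted.size() && relevant.contains(predicted.get(i))) {
                dcg += gain;
            }
            if (i < relevant.size()) {
                idealDcg += gain; // ideal ranking puts all relevant items first
            }
        }
        return dcg / idealDcg;
    }

    /** Average of the per-query NDCG values. */
    static double meanNdcgAt(List<List<Integer>> predictions, List<Set<Integer>> labels, int k) {
        double total = 0.0;
        for (int q = 0; q < predictions.size(); q++) {
            total += ndcgAt(predictions.get(q), labels.get(q), k);
        }
        return predictions.isEmpty() ? 0.0 : total / predictions.size();
    }

    public static void main(String[] args) {
        Set<Integer> relevant = new HashSet<>(Arrays.asList(1, 2, 3));
        System.out.println(ndcgAt(Arrays.asList(1, 2, 3), relevant, 3)); // perfect ranking
        System.out.println(ndcgAt(Arrays.asList(7, 8, 9), relevant, 3)); // nothing relevant
    }
}
```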
needConversion() - Method in class org.apache.spark.sql.sources.BaseRelation: Whether the objects in Row need to be converted to the internal representation, for example: java.lang.String to UTF8String, java.lang.Decimal to Decimal.
needsReconfiguration() - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader: The execution engine will call this method in every epoch to determine if new input partitions need to be generated, which may be required if for example the underlying source system has had partitions added or removed.
negate(Column) - Static method in class org.apache.spark.sql.functions: Unary minus.
negate(Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
newAccumulatorInfos(Iterable<AccumulableInfo>) - Static method in class org.apache.spark.status.LiveEntityHelpers
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Smarter version of newAPIHadoopFile that uses class tags to figure out the classes of keys, values and the org.apache.hadoop.mapreduce.InputFormat (new MapReduce API) so that users don't need to pass them directly.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.SparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newBooleanArrayEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newBooleanEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newBooleanSeqEncoder() - Method in class org.apache.spark.sql.SQLImplicits: Deprecated. Use newSequenceEncoder.
newBoxedBooleanEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newBoxedByteEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newBoxedDoubleEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newBoxedFloatEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newBoxedIntEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newBoxedLongEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newBoxedShortEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newBroadcast(T, boolean, long, ClassTag<T>) - Method in interface org.apache.spark.broadcast.BroadcastFactory: Creates a new broadcast variable.
newByteArrayEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newByteEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newByteSeqEncoder() - Method in class org.apache.spark.sql.SQLImplicits: Deprecated. Use newSequenceEncoder.
newDaemonCachedThreadPool(String) - Static method in class org.apache.spark.util.ThreadUtils: Wrapper over newCachedThreadPool.
newDaemonCachedThreadPool(String, int, int) - Static method in class org.apache.spark.util.ThreadUtils: Create a cached thread pool whose max number of threads is maxThreadNumber.
newDaemonFixedThreadPool(int, String) - Static method in class org.apache.spark.util.ThreadUtils: Wrapper over newFixedThreadPool.
newDaemonSingleThreadExecutor(String) - Static method in class org.apache.spark.util.ThreadUtils: Wrapper over newSingleThreadExecutor.
newDaemonSingleThreadScheduledExecutor(String) - Static method in class org.apache.spark.util.ThreadUtils: Wrapper over ScheduledThreadPoolExecutor.
newDaemonThreadPoolScheduledExecutor(String, int) - Static method in class org.apache.spark.util.ThreadUtils: Wrapper over ScheduledThreadPoolExecutor.
newDateEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newDoubleArrayEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newDoubleEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newDoubleSeqEncoder() - Method in class org.apache.spark.sql.SQLImplicits: Deprecated. Use newSequenceEncoder.
newFloatArrayEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newFloatEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newFloatSeqEncoder() - Method in class org.apache.spark.sql.SQLImplicits: Deprecated. Use newSequenceEncoder.
newForkJoinPool(String, int) - Static method in class org.apache.spark.util.ThreadUtils: Construct a new Scala ForkJoinPool with a specified max parallelism and name prefix.
NewHadoopMapPartitionsWithSplitRDD$() - Constructor for class org.apache.spark.rdd.NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$
NewHadoopRDD<K,V> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the new MapReduce API (org.apache.hadoop.mapreduce).
NewHadoopRDD(SparkContext, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, Configuration) - Constructor for class org.apache.spark.rdd.NewHadoopRDD
NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$ - Class in org.apache.spark.rdd
newId() - Static method in class org.apache.spark.util.AccumulatorContext: Returns a globally unique ID for a new AccumulatorV2.
newInstance() - Method in class org.apache.spark.serializer.JavaSerializer
newInstance() - Method in class org.apache.spark.serializer.KryoSerializer
newInstance() - Method in class org.apache.spark.serializer.Serializer: Creates a new SerializerInstance.
newIntArrayEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newIntEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newIntSeqEncoder() - Method in class org.apache.spark.sql.SQLImplicits: Deprecated. Use newSequenceEncoder.
newJavaDecimalEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newKryo() - Method in class org.apache.spark.serializer.KryoSerializer
newKryoOutput() - Method in class org.apache.spark.serializer.KryoSerializer
newLongArrayEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newLongEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newLongSeqEncoder() - Method in class org.apache.spark.sql.SQLImplicits: Deprecated. Use newSequenceEncoder.
newMapEncoder(TypeTags.TypeTag<T>) - Method in class org.apache.spark.sql.SQLImplicits
newProductArrayEncoder(TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLImplicits
newProductEncoder(TypeTags.TypeTag<T>) - Method in interface org.apache.spark.sql.LowPrioritySQLImplicits
newProductSeqEncoder(TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLImplicits: Deprecated. Use newSequenceEncoder.
newScalaDecimalEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newSequenceEncoder(TypeTags.TypeTag<T>) - Method in class org.apache.spark.sql.SQLImplicits
newSession() - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns a HiveClient as a new session that shares the class loader and Hive client.
newSession() - Method in class org.apache.spark.sql.hive.HiveContext: Deprecated. Returns a new HiveContext as new session, which will have separated SQLConf, UDF/UDAF, temporary tables and SessionState, but sharing the same CacheManager, IsolatedClientLoader and Hive client (both of execution and metadata) with existing HiveContext.
newSession() - Method in class org.apache.spark.sql.SparkSession: Start a new session with isolated SQL configurations, temporary tables, and registered functions, but sharing the underlying SparkContext and cached data.
newSession() - Method in class org.apache.spark.sql.SQLContext: Returns a SQLContext as a new session, with separate SQL configurations, temporary tables, and registered functions, but sharing the same SparkContext, cached data and other things.
newSetEncoder(TypeTags.TypeTag<T>) - Method in class org.apache.spark.sql.SQLImplicits: Notice that we serialize Set to Catalyst array.
newShortArrayEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newShortEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newShortSeqEncoder() - Method in class org.apache.spark.sql.SQLImplicits: Deprecated. Use newSequenceEncoder.
newStringArrayEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newStringEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newStringSeqEncoder() - Method in class org.apache.spark.sql.SQLImplicits: Deprecated. Use newSequenceEncoder.
newTaskTempFile(TaskAttemptContext, Option<String>, String) - Method in class org.apache.spark.internal.io.FileCommitProtocol: Notifies the commit protocol to add a new file, and gets back the full path that should be used.
newTaskTempFile(TaskAttemptContext, Option<String>, String) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
newTaskTempFileAbsPath(TaskAttemptContext, String, String) - Method in class org.apache.spark.internal.io.FileCommitProtocol: Similar to newTaskTempFile(), but allows files to be committed to an absolute output location.
newTaskTempFileAbsPath(TaskAttemptContext, String, String) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
newTemporaryConfiguration(boolean) - Static method in class org.apache.spark.sql.hive.HiveUtils: Constructs a configuration for hive, where the metastore is located in a temp directory.
newTimeStampEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newVersionExternalTempPath(Path, Configuration, String) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
next() - Method in class org.apache.spark.InterruptibleIterator
next() - Method in interface org.apache.spark.mllib.clustering.LDAOptimizer
next() - Method in interface org.apache.spark.sql.sources.v2.reader.InputPartitionReader: Proceeds to the next record; returns false if there are no more records.
next() - Method in class org.apache.spark.status.LiveRDDPartition
next_day(Column, String) - Static method in class org.apache.spark.sql.functions: Returns the first date which is later than the value of the date column that is on the specified day of the week.
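
The "first date later than the given date that falls on the specified day of the week" rule can be illustrated with java.time alone. This sketch is a plain-Java analogue, not the Spark function: it takes a DayOfWeek enum rather than a String, and the class name NextDaySketch is hypothetical.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

// Illustrative plain-Java analogue; NextDaySketch is a hypothetical name.
final class NextDaySketch {

    /** First date strictly after {@code date} that falls on {@code dayOfWeek}. */
    static LocalDate nextDay(LocalDate date, DayOfWeek dayOfWeek) {
        // TemporalAdjusters.next never returns the input date itself.
        return date.with(TemporalAdjusters.next(dayOfWeek));
    }

    public static void main(String[] args) {
        LocalDate monday = LocalDate.of(2015, 7, 27);           // a Monday
        System.out.println(nextDay(monday, DayOfWeek.SUNDAY));  // 2015-08-02
        System.out.println(nextDay(monday, DayOfWeek.MONDAY));  // 2015-08-03
    }
}
```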
nextValue() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
nextValue() - Method in class org.apache.spark.mllib.random.GammaGenerator
nextValue() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
nextValue() - Method in class org.apache.spark.mllib.random.PoissonGenerator
nextValue() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator: Returns an i.i.d. sample.
nextValue() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
nextValue() - Method in class org.apache.spark.mllib.random.UniformGenerator
nextValue() - Method in class org.apache.spark.mllib.random.WeibullGenerator
NGram - Class in org.apache.spark.ml.feature: A feature transformer that converts the input array of strings into an array of n-grams.
NGram(String) - Constructor for class org.apache.spark.ml.feature.NGram
NGram() - Constructor for class org.apache.spark.ml.feature.NGram
NioBufferedFileInputStream - Class in org.apache.spark.io: InputStream implementation which uses direct buffer to read a file to avoid extra copy of data between Java and native memory which happens when using BufferedInputStream.
NioBufferedFileInputStream(File, int) - Constructor for class org.apache.spark.io.NioBufferedFileInputStream
NioBufferedFileInputStream(File) - Constructor for class org.apache.spark.io.NioBufferedFileInputStream
NNLS - Class in org.apache.spark.mllib.optimization: Object used to solve nonnegative least squares problems using a modified projected gradient method.
NNLS() - Constructor for class org.apache.spark.mllib.optimization.NNLS
NNLS.Workspace - Class in org.apache.spark.mllib.optimization
NO_PREF() - Static method in class org.apache.spark.scheduler.TaskLocality
NO_RESOURCE - Static variable in class org.apache.spark.launcher.SparkLauncher: A special value for the resource that tells Spark to not try to process the app resource as a file.
Node - Class in org.apache.spark.ml.tree: Decision tree node interface.
Node() - Constructor for class org.apache.spark.ml.tree.Node
Node - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Node in a decision tree.
Node(int, Predict, double, boolean, Option<Split>, Option<Node>, Option<Node>, Option<InformationGainStats>) - Constructor for class org.apache.spark.mllib.tree.model.Node
node() - Method in class org.apache.spark.scheduler.BlacklistedExecutor
NODE_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
nodeBlacklist() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
NodeData(int, double, double, double[], double, int, int, DecisionTreeModelReadWrite.SplitData) - Constructor for class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
nodeData() - Method in class org.apache.spark.ml.tree.EnsembleModelReadWrite.EnsembleNodeData
NodeData(int, int, org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.PredictData, double, boolean, Option<org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.SplitData>, Option<Object>, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
NodeData$() - Constructor for class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData$
NodeData$() - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData$
nodeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
noLocality() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
Nominal() - Static method in class org.apache.spark.ml.attribute.AttributeType: Nominal type.
NominalAttribute - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: A nominal attribute.
NONE - Static variable in class org.apache.spark.api.java.StorageLevels
None - Static variable in class org.apache.spark.graphx.TripletFields: None of the triplet fields are exposed.
NONE() - Static method in class org.apache.spark.scheduler.SchedulingMode
NONE() - Static method in class org.apache.spark.storage.StorageLevel
nonLocalPaths(String, boolean) - Static method in class org.apache.spark.util.Utils: Return all non-local paths from a comma-separated list of paths.
nonnegative() - Method in interface org.apache.spark.ml.recommendation.ALSParams: Param for whether to apply nonnegativity constraints.
nonNegativeHash(Object) - Static method in class org.apache.spark.util.Utils
nonNegativeMod(int, int) - Static method in class org.apache.spark.util.Utils
NoopDialect - Class in org.apache.spark.sql.jdbc: NOOP dialect object, always returning the neutral element.
NoopDialect() - Constructor for class org.apache.spark.sql.jdbc.NoopDialect
norm(Vector, double) - Static method in class org.apache.spark.ml.linalg.Vectors: Returns the p-norm of this vector.
norm(Vector, double) - Static method in class org.apache.spark.mllib.linalg.Vectors: Returns the p-norm of this vector.
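
For reference, the p-norm that both norm entries mention is (sum of |x_i|^p)^(1/p), with the maximum absolute value as the limit for infinite p. The sketch below works on a plain double[] rather than a Spark Vector, and it assumes p >= 1; the class name PNormSketch is hypothetical.

```java
// Illustrative sketch on plain arrays; PNormSketch is a hypothetical name.
final class PNormSketch {

    /** p-norm of {@code values}; assumes p >= 1, with infinite p meaning the max norm. */
    static double norm(double[] values, double p) {
        if (!(p >= 1.0)) {
            throw new IllegalArgumentException("p must be greater than or equal to 1, got " + p);
        }
        if (Double.isInfinite(p)) {
            double max = 0.0;
            for (double v : values) {
                max = Math.max(max, Math.abs(v)); // limit of the p-norm as p grows
            }
            return max;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += Math.pow(Math.abs(v), p);
        }
        return Math.pow(sum, 1.0 / p);
    }

    public static void main(String[] args) {
        double[] v = {3.0, -4.0};
        System.out.println(norm(v, 1.0));                      // L1 norm
        System.out.println(norm(v, 2.0));                      // Euclidean norm
        System.out.println(norm(v, Double.POSITIVE_INFINITY)); // max norm
    }
}
```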
NormalEquationSolver - Interface in org.apache.spark.ml.optim: Interface for classes that solve the normal equations locally.
normalizeDuration(long) - Static method in class org.apache.spark.streaming.ui.UIUtils: Find the best TimeUnit for converting milliseconds to a friendly string.
Normalizer - Class in org.apache.spark.ml.feature: Normalize a vector to have unit norm using the given p-norm.
Normalizer(String) - Constructor for class org.apache.spark.ml.feature.Normalizer
Normalizer() - Constructor for class org.apache.spark.ml.feature.Normalizer
Normalizer - Class in org.apache.spark.mllib.feature: Normalizes samples individually to unit L^p norm.
Normalizer(double) - Constructor for class org.apache.spark.mllib.feature.Normalizer
Normalizer() - Constructor for class org.apache.spark.mllib.feature.Normalizer
normalizeToProbabilitiesInPlace(DenseVector) - Static method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel: Normalize a vector of raw predictions to be a multinomial probability vector, in place.
normalJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.normalRDD.
normalJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaRDD with the default seed.
normalJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaRDD with the default number of partitions and the default seed.
normalJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.normalVectorRDD.
normalJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaVectorRDD with the default seed.
normalJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaVectorRDD with the default number of partitions and the default seed.
normalRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the standard normal distribution.
normalVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the standard normal distribution.
normL1(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
normL1(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
normL1() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: L1 norm of each dimension.
normL1() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: L1 norm of each column
normL2(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
normL2(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
normL2() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: L2 (Euclidean) norm of each dimension.
normL2() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Euclidean magnitude of each column
normPdf(double, double, double, double) - Static method in class org.apache.spark.mllib.stat.KernelDensity: Evaluates the PDF of a normal distribution.
not(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
not(Column) - Static method in class org.apache.spark.sql.functions: Inversion of boolean expression.
Not - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff child is evaluated to false.
Not(Filter) - Constructor for class org.apache.spark.sql.sources.Not
notEqual(Object) - Method in class org.apache.spark.sql.Column: Inequality test.
NoTimeout() - Static method in class org.apache.spark.sql.streaming.GroupStateTimeout: No timeout.
ntile(int) - Static method in class org.apache.spark.sql.functions: Window function: returns the ntile group id (from 1 to n inclusive) in an ordered window partition.
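
To make the "ntile group id (from 1 to n inclusive)" wording concrete, the sketch below assigns bucket ids to the rows of one ordered partition. It assumes the usual SQL NTILE convention, in which bucket sizes differ by at most one and the larger buckets come first; the class name NtileSketch is hypothetical and no Spark API is used.

```java
import java.util.Arrays;

// Illustrative sketch of NTILE bucket assignment; NtileSketch is a hypothetical name.
final class NtileSketch {

    /** Bucket id (1..n) for each of {@code rowCount} rows, taken in window order. */
    static int[] ntile(int rowCount, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        int[] ids = new int[rowCount];
        int base = rowCount / n;       // minimum rows per bucket
        int remainder = rowCount % n;  // the first `remainder` buckets get one extra row
        int row = 0;
        for (int bucket = 1; bucket <= n && row < rowCount; bucket++) {
            int size = base + (bucket <= remainder ? 1 : 0);
            for (int j = 0; j < size; j++) {
                ids[row++] = bucket;
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(ntile(10, 3))); // [1, 1, 1, 1, 2, 2, 2, 3, 3, 3]
        System.out.println(Arrays.toString(ntile(2, 4)));  // [1, 2]
    }
}
```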
nullable() - Method in class org.apache.spark.sql.catalog.Column
nullable() - Method in class org.apache.spark.sql.expressions.UserDefinedFunction: Returns true when the UDF can return a nullable value.
nullable() - Method in class org.apache.spark.sql.types.StructField
nullDeviance() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: The deviance for the null model.
nullHypothesis() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
nullHypothesis() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
nullHypothesis() - Method in interface org.apache.spark.mllib.stat.test.StreamingTestMethod
nullHypothesis() - Static method in class org.apache.spark.mllib.stat.test.StudentTTest
nullHypothesis() - Method in interface org.apache.spark.mllib.stat.test.TestResult: Null hypothesis of the test.
nullHypothesis() - Static method in class org.apache.spark.mllib.stat.test.WelchTTest
NullHypothesis$() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest.NullHypothesis$
NullHypothesis$() - Constructor for class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest.NullHypothesis$
NullType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the NullType object.
NullType - Class in org.apache.spark.sql.types: The data type representing NULL values.
NullType() - Constructor for class org.apache.spark.sql.types.NullType
NUM_ATTRIBUTES() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
NUM_PARTITIONS() - Static method in class org.apache.spark.ui.UIWorkloadGenerator
NUM_VALUES() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
numAccums() - Static method in class org.apache.spark.util.AccumulatorContext: Returns the number of accumulators registered.
numActiveBatches() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
numActiveOutputOps() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
numActiveReceivers() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
numActives() - Method in class org.apache.spark.ml.linalg.DenseMatrix
numActives() - Method in class org.apache.spark.ml.linalg.DenseVector
numActives() - Method in interface org.apache.spark.ml.linalg.Matrix: Find the number of values stored explicitly.
numActives() - Method in class org.apache.spark.ml.linalg.SparseMatrix
numActives() - Method in class org.apache.spark.ml.linalg.SparseVector
numActives() - Method in interface org.apache.spark.ml.linalg.Vector: Number of active entries.
numActives() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numActives() - Method in class org.apache.spark.mllib.linalg.DenseVector
numActives() - Method in interface org.apache.spark.mllib.linalg.Matrix: Find the number of values stored explicitly.
numActives() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
numActives() - Method in class org.apache.spark.mllib.linalg.SparseVector
numActives() - Method in interface org.apache.spark.mllib.linalg.Vector: Number of active entries.
numActiveStages() - Method in class org.apache.spark.status.api.v1.JobData
numActiveTasks() - Method in interface org.apache.spark.SparkStageInfo
numActiveTasks() - Method in class org.apache.spark.SparkStageInfoImpl
numActiveTasks() - Method in class org.apache.spark.status.api.v1.JobData
numActiveTasks() - Method in class org.apache.spark.status.api.v1.StageData
numAttributes() - Method in class org.apache.spark.ml.attribute.AttributeGroup
numAvailableOutputs() - Method in class org.apache.spark.ShuffleStatus: Number of partitions that have shuffle outputs.
numBins() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
numBuckets() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase: Number of buckets (quantiles, or categories) into which data points are grouped.
numBucketsArray() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase: Array of number of buckets (quantiles, or categories) into which data points are grouped.
numCachedPartitions() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
numCachedPartitions() - Method in class org.apache.spark.storage.RDDInfo
numCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit
numCategories() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData
numClasses() - Method in class org.apache.spark.ml.classification.ClassificationModel: Number of classes (values which the label can take).
numClasses() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
numClasses() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
numClasses() - Method in class org.apache.spark.ml.classification.LinearSVCModel
numClasses() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
numClasses() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
numClasses() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
numClasses() - Method in class org.apache.spark.ml.classification.OneVsRestModel
numClasses() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
numClasses() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
numClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
numColBlocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
numCols() - Method in class org.apache.spark.ml.linalg.DenseMatrix
numCols() - Method in interface org.apache.spark.ml.linalg.Matrix: Number of columns.
numCols() - Method in class org.apache.spark.ml.linalg.SparseMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Gets or computes the number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.Matrix: Number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
numCols() - Method in class org.apache.spark.sql.vectorized.ColumnarBatch: Returns the number of columns that make up this batch.
numCompletedIndices() - Method in class org.apache.spark.status.api.v1.JobData
numCompletedIndices() - Method in class org.apache.spark.status.api.v1.StageData
numCompletedOutputOps() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
numCompletedStages() - Method in class org.apache.spark.status.api.v1.JobData
numCompletedTasks() - Method in interface org.apache.spark.SparkStageInfo
numCompletedTasks() - Method in class org.apache.spark.SparkStageInfoImpl
numCompletedTasks() - Method in class org.apache.spark.status.api.v1.JobData
numCompleteTasks() - Method in class org.apache.spark.status.api.v1.StageData
numEdges() - Method in class org.apache.spark.graphx.GraphOps: The number of edges in the graph.
numElements() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
numElements() - Method in class org.apache.spark.sql.vectorized.ColumnarMap
Numeric() - Static method in class org.apache.spark.ml.attribute.AttributeType: Numeric type.
NumericAttribute - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: A numeric attribute with optional summary statistics.
NumericParser - Class in org.apache.spark.mllib.util: Simple parser for a numeric structure consisting of three types:
NumericParser() - Constructor for class org.apache.spark.mllib.util.NumericParser
numericRDDToDoubleRDDFunctions(RDD<T>, Numeric<T>) - Static method in class org.apache.spark.rdd.RDD
NumericType - Class in org.apache.spark.sql.types: Numeric data types.
NumericType() - Constructor for class org.apache.spark.sql.types.NumericType
numFailedOutputOps() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
numFailedStages() - Method in class org.apache.spark.status.api.v1.JobData
numFailedTasks() - Method in interface org.apache.spark.SparkStageInfo
numFailedTasks() - Method in class org.apache.spark.SparkStageInfoImpl
numFailedTasks() - Method in class org.apache.spark.status.api.v1.JobData
numFailedTasks() - Method in class org.apache.spark.status.api.v1.StageData
numFalseNegatives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix: number of false negatives
numFalsePositives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix: number of false positives
numFeatures() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
numFeatures() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
numFeatures() - Method in class org.apache.spark.ml.classification.LinearSVCModel
numFeatures() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
numFeatures() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
numFeatures() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
numFeatures() - Method in class org.apache.spark.ml.classification.OneVsRestModel
numFeatures() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
numFeatures() - Method in class org.apache.spark.ml.feature.FeatureHasher: Number of features.
numFeatures() - Method in class org.apache.spark.ml.feature.HashingTF: Number of features.
numFeatures() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
numFeatures() - Method in class org.apache.spark.ml.PredictionModel: Returns the number of features the model was trained on.
numFeatures() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
numFeatures() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
numFeatures() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
numFeatures() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
numFeatures() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
numFeatures() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
numFeatures() - Method in class org.apache.spark.mllib.feature.HashingTF
numFields() - Method in class org.apache.spark.sql.vectorized.ColumnarRow
numFolds() - Method in interface org.apache.spark.ml.tuning.CrossValidatorParams: Param for number of folds for cross validation.
numHashTables() - Method in interface org.apache.spark.ml.feature.LSHParams: Param for the number of hash tables used in LSH OR-amplification.
numInactiveReceivers() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
numInputRows() - Method in class org.apache.spark.sql.streaming.SourceProgress
numInputRows() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress: The aggregate (across all sources) number of records processed in a trigger.
numInstances() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: Number of instances in DataFrame predictions.
numInstances() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Number of instances in DataFrame predictions
numItemBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams: Param for number of item blocks (positive).
numIter() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
numIterations() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary
numIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
numKilledTasks() - Method in class org.apache.spark.status.api.v1.JobData
numKilledTasks() - Method in class org.apache.spark.status.api.v1.StageData
numNegatives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix: number of negatives
numNodes() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel: Number of nodes in tree, including leaf nodes.
numNodes() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Get number of nodes in tree, including leaf nodes.
numNonzeros() - Method in class org.apache.spark.ml.linalg.DenseMatrix
numNonzeros() - Method in class org.apache.spark.ml.linalg.DenseVector
numNonzeros() - Method in interface org.apache.spark.ml.linalg.Matrix: Find the number of non-zero active values.
numNonzeros() - Method in class org.apache.spark.ml.linalg.SparseMatrix
numNonzeros() - Method in class org.apache.spark.ml.linalg.SparseVector
numNonzeros() - Method in interface org.apache.spark.ml.linalg.Vector: Number of nonzero elements.
numNonZeros(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
numNonZeros(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
numNonzeros() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numNonzeros() - Method in class org.apache.spark.mllib.linalg.DenseVector
numNonzeros() - Method in interface org.apache.spark.mllib.linalg.Matrix: Find the number of non-zero active values.
numNonzeros() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
numNonzeros() - Method in class org.apache.spark.mllib.linalg.SparseVector
numNonzeros() - Method in interface org.apache.spark.mllib.linalg.Vector: Number of nonzero elements.
numNonzeros() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Number of nonzero elements in each dimension.
numNonzeros() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Number of nonzero elements (including explicitly presented zero values) in each column.
numNulls() - Method in class org.apache.spark.sql.vectorized.ArrowColumnVector
numNulls() - Method in class org.apache.spark.sql.vectorized.ColumnVector: Returns the number of nulls in this column vector.
numOfPoints() - Method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats
numPartitions() - Method in class org.apache.spark.HashPartitioner
numPartitions() - Method in interface org.apache.spark.ml.feature.Word2VecBase: Number of partitions for sentences of words.
numPartitions() - Method in interface org.apache.spark.ml.fpm.FPGrowthParams: Number of partitions (at least 1) used by parallel FP-growth.
numPartitions() - Method in class org.apache.spark.Partitioner
numPartitions() - Method in class org.apache.spark.RangePartitioner
numPartitions() - Method in class org.apache.spark.rdd.PartitionGroup
numPartitions() - Method in interface org.apache.spark.sql.sources.v2.reader.partitioning.Partitioning: Returns the number of partitions (i.e., InputPartitions) the data source outputs.
numPartitions() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
numPartitions() - Method in class org.apache.spark.storage.RDDInfo
numPartitions(int) - Method in class org.apache.spark.streaming.StateSpec: Set the number of partitions by which the state RDDs generated by mapWithState will be partitioned.
numPositives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix: number of positives
numProcessedRecords() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
numReceivedRecords() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
numReceivers() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
numRecords() - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockStoreResult
numRecords() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: The number of records received by the receivers in this batch.
numRecords() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
numRetainedCompletedBatches() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
numRetries(SparkConf) - Static method in class org.apache.spark.util.RpcUtils: Returns the configured number of times to retry connecting
numRowBlocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
numRows() - Method in class org.apache.spark.ml.linalg.DenseMatrix
numRows() - Method in interface org.apache.spark.ml.linalg.Matrix: Number of rows.
numRows() - Method in class org.apache.spark.ml.linalg.SparseMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Gets or computes the number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.Matrix: Number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
numRows() - Method in interface org.apache.spark.sql.sources.v2.reader.Statistics
numRows() - Method in class org.apache.spark.sql.vectorized.ColumnarBatch: Returns the number of rows for read, including filtered rows.
numRowsTotal() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress
numRowsUpdated() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress
numRunningTasks() - Method in interface org.apache.spark.SparkExecutorInfo
numRunningTasks() - Method in class org.apache.spark.SparkExecutorInfoImpl
numSkippedStages() - Method in class org.apache.spark.status.api.v1.JobData
numSkippedTasks() - Method in class org.apache.spark.status.api.v1.JobData
numSpilledStages() - Method in class org.apache.spark.SpillListener
numStreamBlocks() - Method in class org.apache.spark.ui.storage.ExecutorStreamSummary
numTasks() - Method in class org.apache.spark.scheduler.StageInfo
numTasks() - Method in interface org.apache.spark.SparkStageInfo
numTasks() - Method in class org.apache.spark.SparkStageInfoImpl
numTasks() - Method in class org.apache.spark.status.api.v1.JobData
numTasks() - Method in class org.apache.spark.status.api.v1.StageData
numTopFeatures() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams: Number of features that selector will select, ordered by ascending p-value.
numTopFeatures() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
numTotalCompletedBatches() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
numTotalOutputOps() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
numTrees() - Method in class org.apache.spark.ml.classification.GBTClassificationModel: Deprecated. Use getNumTrees instead. This method will be removed in 3.0.0.
numTrees() - Method in class org.apache.spark.ml.regression.GBTRegressionModel: Deprecated. Use getNumTrees instead. This method will be removed in 3.0.0.
numTrees() - Method in interface org.apache.spark.ml.tree.RandomForestParams: Number of trees to train (at least 1).
numTrueNegatives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix: number of true negatives
numTruePositives() - Method in interface org.apache.spark.mllib.evaluation.binary.BinaryConfusionMatrix: number of true positives
numUserBlocks() - Method in interface org.apache.spark.ml.recommendation.ALSParams: Param for number of user blocks (positive).
numValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute
numVertices() - Method in class org.apache.spark.graphx.GraphOps: The number of vertices in the graph.
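
The numActives() and numNonzeros() entries above describe two different counts: values stored explicitly versus values that are actually non-zero. The following is a minimal sketch of that distinction in plain Java. The class SparseCountsSketch and its fields are hypothetical stand-ins for illustration only; they are not Spark's SparseVector and do not reflect its implementation.

```java
// Hypothetical stand-in for a sparse vector (NOT org.apache.spark.ml.linalg.SparseVector).
public class SparseCountsSketch {
    private final int size;        // logical length of the vector
    private final int[] indices;   // positions of the explicitly stored entries
    private final double[] values; // explicitly stored entries (may include zeros)

    SparseCountsSketch(int size, int[] indices, double[] values) {
        if (indices.length != values.length) {
            throw new IllegalArgumentException("indices and values must have equal length");
        }
        this.size = size;
        this.indices = indices;
        this.values = values;
    }

    // Number of explicitly stored entries, regardless of their value.
    int numActives() {
        return values.length;
    }

    // Number of stored entries whose value is not zero.
    int numNonzeros() {
        int count = 0;
        for (double v : values) {
            if (v != 0.0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Length-5 vector storing three entries, one of which is an explicit zero.
        SparseCountsSketch v =
            new SparseCountsSketch(5, new int[] {0, 2, 4}, new double[] {1.5, 0.0, -2.0});
        System.out.println(v.numActives());  // prints 3
        System.out.println(v.numNonzeros()); // prints 2
    }
}
```

In this sketch an explicitly stored zero counts as active but not as non-zero, so the two counts can differ for sparse data.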

O

obj() - Method in class org.apache.spark.internal.io.FileCommitProtocol.TaskCommitMessage
objectFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectiveHistory() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionTrainingSummaryImpl
objectiveHistory() - Method in interface org.apache.spark.ml.classification.LogisticRegressionTrainingSummary: objective function (scaled loss + regularization) at each iteration.
objectiveHistory() - Method in class org.apache.spark.ml.classification.LogisticRegressionTrainingSummaryImpl
objectiveHistory() - Method in class org.apache.spark.ml.regression.LinearRegressionTrainingSummary
ObjectStreamClassMethods(ObjectStreamClass) - Constructor for class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods
ObjectStreamClassMethods$() - Constructor for class org.apache.spark.serializer.SerializationDebugger.ObjectStreamClassMethods$
ObjectType - Class in org.apache.spark.sql.types
ObjectType(Class<?>) - Constructor for class org.apache.spark.sql.types.ObjectType
ocvTypes() - Static method in class org.apache.spark.ml.image.ImageSchema: (Scala-specific) OpenCV type mapping supported
of(T) - Static method in class org.apache.spark.api.java.Optional
of(RDD<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.evaluation.AreaUnderCurve: Returns the area under the given curve.
of(Iterable<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.evaluation.AreaUnderCurve: Returns the area under the given curve.
of(JavaRDD<Tuple2<T, T>>) - Static method in class org.apache.spark.mllib.evaluation.RankingMetrics: Creates a RankingMetrics instance (for Java users).
OFF_HEAP - Static variable in class org.apache.spark.api.java.StorageLevels
OFF_HEAP() - Static method in class org.apache.spark.storage.StorageLevel
offHeapMemoryRemaining() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
offHeapMemoryUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
offHeapUsed() - Method in class org.apache.spark.status.LiveRDDDistribution
Offset - Class in org.apache.spark.sql.sources.v2.reader.streaming: An abstract representation of progress through a MicroBatchReader or ContinuousReader.
Offset() - Constructor for class org.apache.spark.sql.sources.v2.reader.streaming.Offset
offsetBytes(String, long, long, long) - Static method in class org.apache.spark.util.Utils: Return a string containing part of a file from byte 'start' to 'end'.
offsetBytes(Seq<File>, Seq<Object>, long, long) - Static method in class org.apache.spark.util.Utils: Return a string containing data across a set of files.
offsetCol() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase: Param for offset column name.
ofNullable(T) - Static method in class org.apache.spark.api.java.Optional
ofRows(SparkSession, LogicalPlan) - Static method in class org.apache.spark.sql.Dataset
oldVersionExternalTempPath(Path, Configuration, String) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
onAddData(Object, Object) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener: Called after a data item is added into the BlockGenerator.
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.scheduler.SparkListener
onApplicationEnd(SparkListenerApplicationEnd) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the application ends
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.SparkFirehoseListener
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.scheduler.SparkListener
onApplicationStart(SparkListenerApplicationStart) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the application starts
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.SparkFirehoseListener
onBatchCompleted(JavaStreamingListenerBatchCompleted) - Method in interface org.apache.spark.streaming.api.java.PythonStreamingListener: Called when processing of a batch of jobs has completed.
onBatchCompleted(StreamingListenerBatchCompleted) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
onBatchCompleted(StreamingListenerBatchCompleted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a batch of jobs has completed.
onBatchStarted(JavaStreamingListenerBatchStarted) - Method in interface org.apache.spark.streaming.api.java.PythonStreamingListener: Called when processing of a batch of jobs has started.
onBatchStarted(StreamingListenerBatchStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a batch of jobs has started.
onBatchSubmitted(JavaStreamingListenerBatchSubmitted) - Method in interface org.apache.spark.streaming.api.java.PythonStreamingListener: Called when a batch of jobs has been submitted for processing.
onBatchSubmitted(StreamingListenerBatchSubmitted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a batch of jobs has been submitted for processing.
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.scheduler.SparkListener
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when a new block manager has joined
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.SparkFirehoseListener
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.scheduler.SparkListener
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when an existing block manager has been removed
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.SparkFirehoseListener
onBlockUpdated(SparkListenerBlockUpdated) - Method in class org.apache.spark.scheduler.SparkListener
onBlockUpdated(SparkListenerBlockUpdated) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the driver receives a block update info.
onBlockUpdated(SparkListenerBlockUpdated) - Method in class org.apache.spark.SparkFirehoseListener
Once() - Static method in class org.apache.spark.sql.streaming.Trigger: A trigger that processes only one batch of data in a streaming query and then terminates the query.
OnceParser(Function1<Reader<Object>, Parsers.ParseResult<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in interface org.apache.spark.FutureAction: When this action is completed, either through an exception, or a value, applies the provided function.
onComplete(Function1<R, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult: Set a handler to be called when this PartialResult completes.
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
onConnected(RpcAddress) - Method in interface org.apache.spark.rpc.RpcEndpoint: Invoked when remoteAddress is connected to the current node.
onDataWriterCommit(WriterCommitMessage) - Method in interface org.apache.spark.sql.sources.v2.writer.DataSourceWriter: Handles a commit message received from a successful data writer.
onDisconnected(RpcAddress) - Method in interface org.apache.spark.rpc.RpcEndpoint: Invoked when remoteAddress is lost.
OneHotEncoder - Class in org.apache.spark.ml.feature: Deprecated. OneHotEncoderEstimator will be renamed OneHotEncoder and this OneHotEncoder will be removed in 3.0.0.
OneHotEncoder(String) - Constructor for class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
OneHotEncoder() - Constructor for class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
OneHotEncoderBase - Interface in org.apache.spark.ml.feature: Private trait for params and common methods for OneHotEncoderEstimator and OneHotEncoderModel
OneHotEncoderCommon - Class in org.apache.spark.ml.feature: Provides some helper methods used by both OneHotEncoder and OneHotEncoderEstimator.
OneHotEncoderCommon() - Constructor for class org.apache.spark.ml.feature.OneHotEncoderCommon
OneHotEncoderEstimator - Class in org.apache.spark.ml.feature: A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index.
OneHotEncoderEstimator(String) - Constructor for class org.apache.spark.ml.feature.OneHotEncoderEstimator
OneHotEncoderEstimator() - Constructor for class org.apache.spark.ml.feature.OneHotEncoderEstimator
OneHotEncoderModel - Class in org.apache.spark.ml.feature: param: categorySizes Original number of categories for each feature being encoded.
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.scheduler.SparkListener
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when environment properties have been updated
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.SparkFirehoseListener
onError(Throwable) - Method in interface org.apache.spark.rpc.RpcEndpoint: Invoked when any exception is thrown while handling messages.
onError(String, Throwable) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener: Called when an error has occurred in the BlockGenerator.
ones(int, int) - Static method in class org.apache.spark.ml.linalg.DenseMatrix: Generate a DenseMatrix consisting of ones.
ones(int, int) - Static method in class org.apache.spark.ml.linalg.Matrices: Generate a DenseMatrix consisting of ones.
ones(int, int) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a DenseMatrix consisting of ones.
ones(int, int) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a DenseMatrix consisting of ones.
OneSampleTwoSided() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest.NullHypothesis$
OneToOneDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Represents a one-to-one dependency between partitions of the parent and child RDDs.
OneToOneDependency(RDD<T>) - Constructor for class org.apache.spark.OneToOneDependency
onEvent(SparkListenerEvent) - Method in class org.apache.spark.SparkFirehoseListener
OneVsRest - Class in org.apache.spark.ml.classification: Reduction of Multiclass Classification to Binary Classification.
OneVsRest(String) - Constructor for class org.apache.spark.ml.classification.OneVsRest
OneVsRest() - Constructor for class org.apache.spark.ml.classification.OneVsRest
OneVsRestModel - Class in org.apache.spark.ml.classification: Model produced by OneVsRest.
OneVsRestParams - Interface in org.apache.spark.ml.classification: Params for OneVsRest.
onExecutorAdded(SparkListenerExecutorAdded) - Method in class org.apache.spark.scheduler.SparkListener
onExecutorAdded(SparkListenerExecutorAdded) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the driver registers a new executor.
onExecutorAdded(SparkListenerExecutorAdded) - Method in class org.apache.spark.SparkFirehoseListener
onExecutorBlacklisted(SparkListenerExecutorBlacklisted) - Method in class org.apache.spark.scheduler.SparkListener
onExecutorBlacklisted(SparkListenerExecutorBlacklisted) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the driver blacklists an executor for a Spark application.
onExecutorBlacklisted(SparkListenerExecutorBlacklisted) - Method in class org.apache.spark.SparkFirehoseListener
onExecutorBlacklistedForStage(SparkListenerExecutorBlacklistedForStage) - Method in class org.apache.spark.scheduler.SparkListener
onExecutorBlacklistedForStage(SparkListenerExecutorBlacklistedForStage) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the driver blacklists an executor for a stage.
onExecutorBlacklistedForStage(SparkListenerExecutorBlacklistedForStage) - Method in class org.apache.spark.SparkFirehoseListener
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.scheduler.SparkListener
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the driver receives task metrics from an executor in a heartbeat.
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.SparkFirehoseListener
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in class org.apache.spark.scheduler.SparkListener
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the driver removes an executor.
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in class org.apache.spark.SparkFirehoseListener
onExecutorUnblacklisted(SparkListenerExecutorUnblacklisted) - Method in class org.apache.spark.scheduler.SparkListener
onExecutorUnblacklisted(SparkListenerExecutorUnblacklisted) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the driver re-enables a previously blacklisted executor.
onExecutorUnblacklisted(SparkListenerExecutorUnblacklisted) - Method in class org.apache.spark.SparkFirehoseListener
onFail(Function1<Exception, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult: Set a handler to be called if this PartialResult's job fails.
onFailure(Throwable) - Method in interface org.apache.spark.rpc.netty.OutboxMessage
onFailure(String, QueryExecution, Exception) - Method in interface org.apache.spark.sql.util.QueryExecutionListener: A callback function that will be called when a query execution failed.
onGenerateBlock(StreamBlockId) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener: Called when a new block of data is generated by the block generator.
onHeapMemoryRemaining() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
onHeapMemoryUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
onHeapUsed() - Method in class org.apache.spark.status.LiveRDDDistribution
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.scheduler.SparkListener
onJobEnd(SparkListenerJobEnd) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when a job ends.
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.SparkFirehoseListener
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.scheduler.SparkListener
onJobStart(SparkListenerJobStart) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when a job starts.
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.SparkFirehoseListener
OnlineLDAOptimizer - Class in org.apache.spark.mllib.clustering: :: DeveloperApi ::
OnlineLDAOptimizer() - Constructor for class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
onNetworkError(Throwable, RpcAddress) - Method in interface org.apache.spark.rpc.RpcEndpoint: Invoked when some network error happens in the connection between the current node and remoteAddress.
onNodeBlacklisted(SparkListenerNodeBlacklisted) - Method in class org.apache.spark.scheduler.SparkListener
onNodeBlacklisted(SparkListenerNodeBlacklisted) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the driver blacklists a node for a Spark application.
onNodeBlacklisted(SparkListenerNodeBlacklisted) - Method in class org.apache.spark.SparkFirehoseListener
onNodeBlacklistedForStage(SparkListenerNodeBlacklistedForStage) - Method in class org.apache.spark.scheduler.SparkListener
onNodeBlacklistedForStage(SparkListenerNodeBlacklistedForStage) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the driver blacklists a node for a stage.
onNodeBlacklistedForStage(SparkListenerNodeBlacklistedForStage) - Method in class org.apache.spark.SparkFirehoseListener
onNodeUnblacklisted(SparkListenerNodeUnblacklisted) - Method in class org.apache.spark.scheduler.SparkListener
onNodeUnblacklisted(SparkListenerNodeUnblacklisted) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when the driver re-enables a previously blacklisted node.
onNodeUnblacklisted(SparkListenerNodeUnblacklisted) - Method in class org.apache.spark.SparkFirehoseListener
onOtherEvent(SparkListenerEvent) - Method in class org.apache.spark.scheduler.SparkListener
onOtherEvent(SparkListenerEvent) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when other events like SQL-specific events are posted.
onOtherEvent(SparkListenerEvent) - Method in class org.apache.spark.SparkFirehoseListener
onOutputOperationCompleted(JavaStreamingListenerOutputOperationCompleted) - Method in interface org.apache.spark.streaming.api.java.PythonStreamingListener: Called when processing of a job of a batch has completed.
onOutputOperationCompleted(StreamingListenerOutputOperationCompleted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a job of a batch has completed.
onOutputOperationStarted(JavaStreamingListenerOutputOperationStarted) - Method in interface org.apache.spark.streaming.api.java.PythonStreamingListener: Called when processing of a job of a batch has started.
onOutputOperationStarted(StreamingListenerOutputOperationStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a job of a batch has started.
onPushBlock(StreamBlockId, ArrayBuffer<?>) - Method in interface org.apache.spark.streaming.receiver.BlockGeneratorListener: Called when a new block is ready to be pushed.
onQueryProgress(StreamingQueryListener.QueryProgressEvent) - Method in class org.apache.spark.sql.streaming.StreamingQueryListener: Called when there is some status update (ingestion rate updated, etc.).
onQueryStarted(StreamingQueryListener.QueryStartedEvent) - Method in class org.apache.spark.sql.streaming.StreamingQueryListener: Called when a query is started.
onQueryTerminated(StreamingQueryListener.QueryTerminatedEvent) - Method in class org.apache.spark.sql.streaming.StreamingQueryListener: Called when a query is stopped, with or without error.
onReceiverError(JavaStreamingListenerReceiverError) - Method in interface org.apache.spark.streaming.api.java.PythonStreamingListener: Called when a receiver has reported an error.
onReceiverError(StreamingListenerReceiverError) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has reported an error.
onReceiverStarted(JavaStreamingListenerReceiverStarted) - Method in interface org.apache.spark.streaming.api.java.PythonStreamingListener: Called when a receiver has been started.
onReceiverStarted(StreamingListenerReceiverStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has been started.
onReceiverStopped(JavaStreamingListenerReceiverStopped) - Method in interface org.apache.spark.streaming.api.java.PythonStreamingListener: Called when a receiver has been stopped.
onReceiverStopped(StreamingListenerReceiverStopped) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has been stopped.
onSpeculativeTaskSubmitted(SparkListenerSpeculativeTaskSubmitted) - Method in class org.apache.spark.scheduler.SparkListener
onSpeculativeTaskSubmitted(SparkListenerSpeculativeTaskSubmitted) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when a speculative task is submitted.
onSpeculativeTaskSubmitted(SparkListenerSpeculativeTaskSubmitted) - Method in class org.apache.spark.SparkFirehoseListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.SparkListener
onStageCompleted(SparkListenerStageCompleted) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when a stage completes successfully or fails, with information on the completed stage.
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.StatsReportListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.SparkFirehoseListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.SpillListener
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.scheduler.SparkListener
onStageSubmitted(SparkListenerStageSubmitted) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when a stage is submitted.
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.SparkFirehoseListener
OnStart - Class in org.apache.spark.rpc.netty
OnStart() - Constructor for class org.apache.spark.rpc.netty.OnStart
onStart() - Method in interface org.apache.spark.rpc.RpcEndpoint: Invoked before RpcEndpoint starts to handle any message.
onStart() - Method in class org.apache.spark.streaming.receiver.Receiver: This method is called by the system when the receiver is started.
OnStop - Class in org.apache.spark.rpc.netty
OnStop() - Constructor for class org.apache.spark.rpc.netty.OnStop
onStop() - Method in interface org.apache.spark.rpc.RpcEndpoint: Invoked when RpcEndpoint is stopping.
onStop() - Method in class org.apache.spark.streaming.receiver.Receiver: This method is called by the system when the receiver is stopped.
onStreamingStarted(JavaStreamingListenerStreamingStarted) - Method in interface org.apache.spark.streaming.api.java.PythonStreamingListener: Called when the streaming has been started.
onStreamingStarted(StreamingListenerStreamingStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when the streaming has been started.
onSuccess(String, QueryExecution, long) - Method in interface org.apache.spark.sql.util.QueryExecutionListener: A callback function that will be called when a query executed successfully.
onTaskCommit(FileCommitProtocol.TaskCommitMessage) - Method in class org.apache.spark.internal.io.FileCommitProtocol: Called on the driver after a task commits.
onTaskCompletion(TaskContext) - Method in interface org.apache.spark.util.TaskCompletionListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.SparkListener
onTaskEnd(SparkListenerTaskEnd) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when a task ends.
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.StatsReportListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.SparkFirehoseListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.SpillListener
onTaskFailure(TaskContext, Throwable) - Method in interface org.apache.spark.util.TaskFailureListener
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.scheduler.SparkListener
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when a task begins remotely fetching its result (will not be called for tasks that do not need to fetch the result remotely).
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.SparkFirehoseListener
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.scheduler.SparkListener
onTaskStart(SparkListenerTaskStart) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when a task starts.
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.SparkFirehoseListener
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.scheduler.SparkListener
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in interface org.apache.spark.scheduler.SparkListenerInterface: Called when an RDD is manually unpersisted by the application.
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.SparkFirehoseListener
OOM() - Static method in class org.apache.spark.util.SparkExitCode: The default uncaught exception handler was reached, and the uncaught exception was an OutOfMemoryError.
open() - Method in class org.apache.spark.input.PortableDataStream: Create a new DataInputStream from the split and context.
open(long, long) - Method in class org.apache.spark.sql.ForeachWriter: Called when starting to process one partition of new data in the executor.
open(File, M, ClassTag<M>) - Static method in class org.apache.spark.status.KVUtils: Open or create a LevelDB store.
ops() - Method in class org.apache.spark.graphx.Graph: The associated GraphOps object.
opt(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.GradientDescent: :: DeveloperApi :: Runs gradient descent on the given training data.
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.LBFGS
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in interface org.apache.spark.mllib.optimization.Optimizer: Solve the provided convex optimization problem.
optimizeDocConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams: For Online optimizer only (currently): optimizer = "online".
optimizer() - Method in interface org.apache.spark.ml.clustering.LDAParams: Optimizer or inference algorithm used to estimate the LDA model.
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
optimizer() - Method in class org.apache.spark.mllib.classification.SVMWithSGD
Optimizer - Interface in org.apache.spark.mllib.optimization: :: DeveloperApi :: Trait for optimization problem solvers.
optimizer() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: The optimizer to solve the problem.
optimizer() - Method in class org.apache.spark.mllib.regression.LassoWithSGD
optimizer() - Method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
optimizer() - Method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
option(String, String) - Method in class org.apache.spark.ml.util.MLWriter: Adds an option to the underlying MLWriter.
option(String, String) - Method in class org.apache.spark.sql.DataFrameReader: Adds an input option for the underlying data source.
option(String, boolean) - Method in class org.apache.spark.sql.DataFrameReader: Adds an input option for the underlying data source.
option(String, long) - Method in class org.apache.spark.sql.DataFrameReader: Adds an input option for the underlying data source.
option(String, double) - Method in class org.apache.spark.sql.DataFrameReader: Adds an input option for the underlying data source.
option(String, String) - Method in class org.apache.spark.sql.DataFrameWriter: Adds an output option for the underlying data source.
option(String, boolean) - Method in class org.apache.spark.sql.DataFrameWriter: Adds an output option for the underlying data source.
option(String, long) - Method in class org.apache.spark.sql.DataFrameWriter: Adds an output option for the underlying data source.
option(String, double) - Method in class org.apache.spark.sql.DataFrameWriter: Adds an output option for the underlying data source.
option(String, String) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Adds an input option for the underlying data source.
option(String, boolean) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Adds an input option for the underlying data source.
option(String, long) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Adds an input option for the underlying data source.
option(String, double) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Adds an input option for the underlying data source.
option(String, String) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Adds an output option for the underlying data source.
option(String, boolean) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Adds an output option for the underlying data source.
option(String, long) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Adds an output option for the underlying data source.
option(String, double) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Adds an output option for the underlying data source.
Optional<T> - Class in org.apache.spark.api.java: Like java.util.Optional in Java 8, scala.Option in Scala, and com.google.common.base.Optional in Google Guava, this class represents a value of a given type that may or may not exist.
options(Map<String, String>) - Method in class org.apache.spark.sql.DataFrameReader: (Scala-specific) Adds input options for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.DataFrameReader: Adds input options for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.DataFrameWriter: (Scala-specific) Adds output options for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.DataFrameWriter: Adds output options for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.streaming.DataStreamReader: (Scala-specific) Adds input options for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.streaming.DataStreamReader: (Java-specific) Adds input options for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: (Scala-specific) Adds output options for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Adds output options for the underlying data source.
optionSparkSession() - Method in interface org.apache.spark.ml.util.BaseReadWrite
optionToOptional(Option<T>) - Static method in class org.apache.spark.api.java.JavaUtils
or(T) - Method in class org.apache.spark.api.java.Optional
or(Column) - Method in class org.apache.spark.sql.Column: Boolean OR.
Or - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff at least one of left or right evaluates to true.
Or(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.Or
OracleDialect - Class in org.apache.spark.sql.jdbc
OracleDialect() - Constructor for class org.apache.spark.sql.jdbc.OracleDialect
orc(String...) - Method in class org.apache.spark.sql.DataFrameReader: Loads ORC files and returns the result as a DataFrame.
orc(String) - Method in class org.apache.spark.sql.DataFrameReader: Loads an ORC file and returns the result as a DataFrame.
orc(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads ORC files and returns the result as a DataFrame.
orc(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame in ORC format at the specified path.
orc(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Loads an ORC file stream, returning the result as a DataFrame.
OrcFileFormat - Class in org.apache.spark.sql.hive.orc: FileFormat for reading ORC files.
OrcFileFormat() - Constructor for class org.apache.spark.sql.hive.orc.OrcFileFormat
OrcFileOperator - Class in org.apache.spark.sql.hive.orc
OrcFileOperator() - Constructor for class org.apache.spark.sql.hive.orc.OrcFileOperator
OrcFilters - Class in org.apache.spark.sql.hive.orc: Helper object for building ORC SearchArguments, which are used for ORC predicate push-down.
OrcFilters() - Constructor for class org.apache.spark.sql.hive.orc.OrcFilters
orderBy(String, String...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset sorted by the given expressions.
orderBy(Column...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset sorted by the given expressions.
orderBy(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset sorted by the given expressions.
orderBy(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset sorted by the given expressions.
orderBy(String, String...) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the ordering defined.
orderBy(Column...) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the ordering defined.
orderBy(String, Seq<String>) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the ordering defined.
orderBy(Seq<Column>) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the ordering defined.
orderBy(String, String...) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the ordering columns in a WindowSpec.
orderBy(Column...) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the ordering columns in a WindowSpec.
orderBy(String, Seq<String>) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the ordering columns in a WindowSpec.
orderBy(Seq<Column>) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the ordering columns in a WindowSpec.
OrderedRDDFunctions<K,V,P extends scala.Product2<K,V>> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion.
OrderedRDDFunctions(RDD, Ordering<K>, ClassTag<K>, ClassTag<V>, ClassTag) - Constructor for class org.apache.spark.rdd.OrderedRDDFunctions
ordering() - Static method in class org.apache.spark.streaming.Time
ORDINAL() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
orElse(T) - Method in class org.apache.spark.api.java.Optional
org.apache.spark - package org.apache.spark: Core Spark classes in Scala.
org.apache.spark.api.java - package org.apache.spark.api.java: Spark Java programming APIs.
org.apache.spark.api.java.function - package org.apache.spark.api.java.function: Set of interfaces to represent functions in Spark's Java API.
org.apache.spark.api.r - package org.apache.spark.api.r
org.apache.spark.broadcast - package org.apache.spark.broadcast: Spark's broadcast variables, used to broadcast immutable datasets to all nodes.
org.apache.spark.graphx - package org.apache.spark.graphx: ALPHA COMPONENT: GraphX is a graph processing framework built on top of Spark.
org.apache.spark.graphx.impl - package org.apache.spark.graphx.impl
org.apache.spark.graphx.lib - package org.apache.spark.graphx.lib: Various analytics functions for graphs.
org.apache.spark.graphx.util - package org.apache.spark.graphx.util: Collections of utilities used by GraphX.
org.apache.spark.input - package org.apache.spark.input
org.apache.spark.internal - package org.apache.spark.internal
org.apache.spark.internal.config - package org.apache.spark.internal.config
org.apache.spark.internal.io - package org.apache.spark.internal.io
org.apache.spark.io - package org.apache.spark.io: IO codecs used for compression.
org.apache.spark.launcher - package org.apache.spark.launcher: Library for launching Spark applications programmatically.
org.apache.spark.mapred - package org.apache.spark.mapred
org.apache.spark.metrics.sink - package org.apache.spark.metrics.sink
org.apache.spark.metrics.source - package org.apache.spark.metrics.source
org.apache.spark.ml - package org.apache.spark.ml: DataFrame-based machine learning APIs to let users quickly assemble and configure practical machine learning pipelines.
org.apache.spark.ml.ann - package org.apache.spark.ml.ann
org.apache.spark.ml.attribute - package org.apache.spark.ml.attribute: ML attributes.
org.apache.spark.ml.classification - package org.apache.spark.ml.classification
org.apache.spark.ml.clustering - package org.apache.spark.ml.clustering
org.apache.spark.ml.evaluation - package org.apache.spark.ml.evaluation
org.apache.spark.ml.feature - package org.apache.spark.ml.feature: Feature transformers: the `ml.feature` package provides common feature transformers that help convert raw data or features into more suitable forms for model fitting.
org.apache.spark.ml.fpm - package org.apache.spark.ml.fpm
org.apache.spark.ml.image - package org.apache.spark.ml.image
org.apache.spark.ml.impl - package org.apache.spark.ml.impl
org.apache.spark.ml.linalg - package org.apache.spark.ml.linalg
org.apache.spark.ml.optim - package org.apache.spark.ml.optim
org.apache.spark.ml.optim.aggregator - package org.apache.spark.ml.optim.aggregator
org.apache.spark.ml.optim.loss - package org.apache.spark.ml.optim.loss
org.apache.spark.ml.param - package org.apache.spark.ml.param
org.apache.spark.ml.param.shared - package org.apache.spark.ml.param.shared
org.apache.spark.ml.r - package org.apache.spark.ml.r
org.apache.spark.ml.recommendation - package org.apache.spark.ml.recommendation
org.apache.spark.ml.regression - package org.apache.spark.ml.regression
org.apache.spark.ml.source.image - package org.apache.spark.ml.source.image
org.apache.spark.ml.source.libsvm - package org.apache.spark.ml.source.libsvm
org.apache.spark.ml.stat - package org.apache.spark.ml.stat
org.apache.spark.ml.stat.distribution - package org.apache.spark.ml.stat.distribution
org.apache.spark.ml.tree - package org.apache.spark.ml.tree
org.apache.spark.ml.tree.impl - package org.apache.spark.ml.tree.impl
org.apache.spark.ml.tuning - package org.apache.spark.ml.tuning
org.apache.spark.ml.util - package org.apache.spark.ml.util
org.apache.spark.mllib - package org.apache.spark.mllib: RDD-based machine learning APIs (in maintenance mode).
org.apache.spark.mllib.classification - package org.apache.spark.mllib.classification
org.apache.spark.mllib.classification.impl - package org.apache.spark.mllib.classification.impl
org.apache.spark.mllib.clustering - package org.apache.spark.mllib.clustering
org.apache.spark.mllib.evaluation - package org.apache.spark.mllib.evaluation
org.apache.spark.mllib.evaluation.binary - package org.apache.spark.mllib.evaluation.binary
org.apache.spark.mllib.feature - package org.apache.spark.mllib.feature
org.apache.spark.mllib.fpm - package org.apache.spark.mllib.fpm
org.apache.spark.mllib.linalg - package org.apache.spark.mllib.linalg
org.apache.spark.mllib.linalg.distributed - package org.apache.spark.mllib.linalg.distributed
org.apache.spark.mllib.optimization - package org.apache.spark.mllib.optimization
org.apache.spark.mllib.pmml - package org.apache.spark.mllib.pmml
org.apache.spark.mllib.pmml.export - package org.apache.spark.mllib.pmml.export
org.apache.spark.mllib.random - package org.apache.spark.mllib.random
org.apache.spark.mllib.rdd - package org.apache.spark.mllib.rdd
org.apache.spark.mllib.recommendation - package org.apache.spark.mllib.recommendation
org.apache.spark.mllib.regression - package org.apache.spark.mllib.regression
org.apache.spark.mllib.regression.impl - package org.apache.spark.mllib.regression.impl
org.apache.spark.mllib.stat - package org.apache.spark.mllib.stat
org.apache.spark.mllib.stat.correlation - package org.apache.spark.mllib.stat.correlation
org.apache.spark.mllib.stat.distribution - package org.apache.spark.mllib.stat.distribution
org.apache.spark.mllib.stat.test - package org.apache.spark.mllib.stat.test
org.apache.spark.mllib.tree - package org.apache.spark.mllib.tree
org.apache.spark.mllib.tree.configuration - package org.apache.spark.mllib.tree.configuration
org.apache.spark.mllib.tree.impurity - package org.apache.spark.mllib.tree.impurity
org.apache.spark.mllib.tree.loss - package org.apache.spark.mllib.tree.loss
org.apache.spark.mllib.tree.model - package org.apache.spark.mllib.tree.model
org.apache.spark.mllib.util - package org.apache.spark.mllib.util
org.apache.spark.partial - package org.apache.spark.partial
org.apache.spark.rdd - package org.apache.spark.rdd: Provides implementations of various RDDs.
org.apache.spark.rpc - package org.apache.spark.rpc
org.apache.spark.rpc.netty - package org.apache.spark.rpc.netty
org.apache.spark.scheduler - package org.apache.spark.scheduler: Spark's DAG scheduler.
org.apache.spark.scheduler.cluster - package org.apache.spark.scheduler.cluster
org.apache.spark.scheduler.local - package org.apache.spark.scheduler.local
org.apache.spark.security - package org.apache.spark.security
org.apache.spark.serializer - package org.apache.spark.serializer: Pluggable serializers for RDD and shuffle data.
org.apache.spark.sql - package org.apache.spark.sql
org.apache.spark.sql.api.java - package org.apache.spark.sql.api.java: Allows the execution of relational queries, including those expressed in SQL using Spark.
org.apache.spark.sql.api.r - package org.apache.spark.sql.api.r
org.apache.spark.sql.catalog - package org.apache.spark.sql.catalog
org.apache.spark.sql.expressions - package org.apache.spark.sql.expressions
org.apache.spark.sql.expressions.javalang - package org.apache.spark.sql.expressions.javalang
org.apache.spark.sql.expressions.scalalang - package org.apache.spark.sql.expressions.scalalang
org.apache.spark.sql.hive - package org.apache.spark.sql.hive
org.apache.spark.sql.hive.client - package org.apache.spark.sql.hive.client
org.apache.spark.sql.hive.execution - package org.apache.spark.sql.hive.execution
org.apache.spark.sql.hive.orc - package org.apache.spark.sql.hive.orc
org.apache.spark.sql.jdbc - package org.apache.spark.sql.jdbc
org.apache.spark.sql.sources - package org.apache.spark.sql.sources
org.apache.spark.sql.sources.v2 - package org.apache.spark.sql.sources.v2
org.apache.spark.sql.sources.v2.reader - package org.apache.spark.sql.sources.v2.reader
org.apache.spark.sql.sources.v2.reader.partitioning - package org.apache.spark.sql.sources.v2.reader.partitioning
org.apache.spark.sql.sources.v2.reader.streaming - package org.apache.spark.sql.sources.v2.reader.streaming
org.apache.spark.sql.sources.v2.writer - package org.apache.spark.sql.sources.v2.writer
org.apache.spark.sql.sources.v2.writer.streaming - package org.apache.spark.sql.sources.v2.writer.streaming
org.apache.spark.sql.streaming - package org.apache.spark.sql.streaming
org.apache.spark.sql.types - package org.apache.spark.sql.types
org.apache.spark.sql.util - package org.apache.spark.sql.util
org.apache.spark.sql.vectorized - package org.apache.spark.sql.vectorized
org.apache.spark.status - package org.apache.spark.status
org.apache.spark.status.api.v1 - package org.apache.spark.status.api.v1
org.apache.spark.status.api.v1.streaming - package org.apache.spark.status.api.v1.streaming
org.apache.spark.storage - package org.apache.spark.storage
org.apache.spark.storage.memory - package org.apache.spark.storage.memory
org.apache.spark.streaming - package org.apache.spark.streaming
org.apache.spark.streaming.api.java - package org.apache.spark.streaming.api.java: Java APIs for Spark Streaming.
org.apache.spark.streaming.dstream - package org.apache.spark.streaming.dstream: Various implementations of DStreams.
org.apache.spark.streaming.kinesis - package org.apache.spark.streaming.kinesis
org.apache.spark.streaming.receiver - package org.apache.spark.streaming.receiver
org.apache.spark.streaming.scheduler - package org.apache.spark.streaming.scheduler
org.apache.spark.streaming.scheduler.rate - package org.apache.spark.streaming.scheduler.rate
org.apache.spark.streaming.ui - package org.apache.spark.streaming.ui
org.apache.spark.streaming.util - package org.apache.spark.streaming.util
org.apache.spark.ui - package org.apache.spark.ui
org.apache.spark.ui.jobs - package org.apache.spark.ui.jobs
org.apache.spark.ui.storage - package org.apache.spark.ui.storage
org.apache.spark.util - package org.apache.spark.util: Spark utilities.
org.apache.spark.util.logging - package org.apache.spark.util.logging
org.apache.spark.util.random - package org.apache.spark.util.random: Utilities for random number generation.
org.apache.spark.util.sketch - package org.apache.spark.util.sketch
original() - Method in interface org.apache.spark.security.CryptoStreamUtils.BaseErrorHandler: The underlying stream that is being wrapped by the encrypted stream, so that it can be closed even if there's an error in the crypto layer.
originalMax() - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
originalMin() - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
orNull() - Method in class org.apache.spark.api.java.Optional
other() - Method in class org.apache.spark.scheduler.RuntimePercentage
otherVertexAttr(long) - Method in class org.apache.spark.graphx.EdgeTriplet: Given one vertex in the edge, return the attribute of the other vertex.
otherVertexId(long) - Method in class org.apache.spark.graphx.Edge: Given one vertex in the edge, return the other vertex.
otherwise(Object) - Method in class org.apache.spark.sql.Column: Evaluates a list of conditions and returns one of multiple possible result expressions.
Out() - Static method in class org.apache.spark.graphx.EdgeDirection: Edges originating from a vertex.
OutboxMessage - Interface in org.apache.spark.rpc.netty
outDegrees() - Method in class org.apache.spark.graphx.GraphOps: The out-degree of each vertex in the graph.
outerJoinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, Option<U>, VD2>, ClassTag<U>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph: Joins the vertices with entries in the table RDD and merges the results using mapFunc.
outerJoinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, Option<U>, VD2>, ClassTag<U>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
output() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
OUTPUT() - Static method in class org.apache.spark.ui.ToolTips
output$() - Constructor for class org.apache.spark.InternalAccumulator.output$
OUTPUT_FORMAT() - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
OUTPUT_METRICS_PREFIX() - Static method in class org.apache.spark.InternalAccumulator
OUTPUT_RECORDS() - Static method in class org.apache.spark.status.TaskIndexNames
OUTPUT_SIZE() - Static method in class org.apache.spark.status.TaskIndexNames
outputBytes() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
outputBytes() - Method in class org.apache.spark.status.api.v1.StageData
outputCol() - Method in interface org.apache.spark.ml.param.shared.HasOutputCol: Param for output column name.
outputCols() - Method in interface org.apache.spark.ml.param.shared.HasOutputCols: Param for output column names.
outputColumnNames() - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
outputColumnNames() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
outputColumnNames() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
OutputCommitCoordinationMessage - Interface in org.apache.spark.scheduler
outputCommitCoordinator() - Method in class org.apache.spark.SparkEnv
outputEncoder() - Method in class org.apache.spark.sql.expressions.Aggregator: Specifies the Encoder for the final output value type.
outputFormat() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
OutputMetricDistributions - Class in org.apache.spark.status.api.v1
OutputMetrics - Class in org.apache.spark.status.api.v1
outputMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
outputMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
outputMode(OutputMode) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Specifies how data of a streaming DataFrame/Dataset is written to a streaming sink.
outputMode(String) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Specifies how data of a streaming DataFrame/Dataset is written to a streaming sink.
OutputMode - Class in org.apache.spark.sql.streaming: OutputMode describes what data will be written to a streaming sink when there is new data available in a streaming DataFrame/Dataset.
OutputMode() - Constructor for class org.apache.spark.sql.streaming.OutputMode
OutputOperationInfo - Class in org.apache.spark.status.api.v1.streaming
OutputOperationInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Class having information on output operations.
OutputOperationInfo(Time, int, String, String, Option<Object>, Option<Object>, Option<String>) - Constructor for class org.apache.spark.streaming.scheduler.OutputOperationInfo
outputOperationInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
outputOperationInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
outputOperationInfos() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
outputOpId() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
outputPartitioning() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
outputPartitioning() - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsReportPartitioning: Returns the output data partitioning that this reader guarantees.
outputRecords() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
outputRecords() - Method in class org.apache.spark.status.api.v1.StageData
outputRowFormat() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
outputRowFormatMap() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
outputSerdeClass() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
outputSerdeProps() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
over(WindowSpec) - Method in class org.apache.spark.sql.Column: Defines a windowing column.
over() - Method in class org.apache.spark.sql.Column: Defines an empty analytic clause.
overallScore(Dataset<Row>, Column) - Static method in class org.apache.spark.ml.evaluation.CosineSilhouette
overallScore(Dataset<Row>, Column) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette
overwrite() - Method in class org.apache.spark.ml.util.MLWriter: Overwrites if the output path already exists.
overwrite() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
overwrite() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable

P

p() - Method in class org.apache.spark.ml.feature.Normalizer: Normalization in L^p space.
PagedTable<T> - Interface in org.apache.spark.ui: A paged table that will generate an HTML table for a specified page and also the page navigation.
pageLink(int) - Method in interface org.apache.spark.ui.PagedTable: Return a link to jump to a page.
pageNavigation(int, int, int) - Method in interface org.apache.spark.ui.PagedTable: Return a page navigation.
pageNumberFormField() - Method in interface org.apache.spark.ui.PagedTable
pageRank(double, double) - Method in class org.apache.spark.graphx.GraphOps: Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
PageRank - Class in org.apache.spark.graphx.lib: PageRank algorithm implementation.
PageRank() - Constructor for class org.apache.spark.graphx.lib.PageRank
pageSizeFormField() - Method in interface org.apache.spark.ui.PagedTable
PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream: Extra functions available on DStream of (key, value) pairs through an implicit conversion.
PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function: A function that returns zero or more key-value pair records from each input record.
PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function: A function that returns key-value pairs (Tuple2<K, V>), and can be used to construct PairRDDs.
PairRDDFunctions<K,V> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
PairwiseRRDD<T> - Class in org.apache.spark.api.r: Form an RDD[(Int, Array[Byte])] from key-value pairs returned from R.
PairwiseRRDD(RDD<T>, int, byte[], String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.PairwiseRRDD
parallelism() - Method in interface org.apache.spark.ml.param.shared.HasParallelism: The number of threads to use when running parallel algorithms.
parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
Param<T> - Class in org.apache.spark.ml.param: :: DeveloperApi :: A param with self-contained documentation and, optionally, a default value.
Param(String, String, String, Function1<T, Object>) - Constructor for class org.apache.spark.ml.param.Param
Param(Identifiable, String, String, Function1<T, Object>) - Constructor for class org.apache.spark.ml.param.Param
Param(String, String, String) - Constructor for class org.apache.spark.ml.param.Param
Param(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.Param
param() - Method in class org.apache.spark.ml.param.ParamPair
ParamGridBuilder - Class in org.apache.spark.ml.tuning: Builder for a param grid used in grid search-based model selection.
ParamGridBuilder() - Constructor for class org.apache.spark.ml.tuning.ParamGridBuilder
ParamMap - Class in org.apache.spark.ml.param: A param to value map.
ParamMap() - Constructor for class org.apache.spark.ml.param.ParamMap: Creates an empty param map.
paramMap() - Method in interface org.apache.spark.ml.param.Params: Internal param map for user-supplied values.
ParamPair<T> - Class in org.apache.spark.ml.param: A param and its value.
ParamPair(Param<T>, T) - Constructor for class org.apache.spark.ml.param.ParamPair
Params - Interface in org.apache.spark.ml.param: :: DeveloperApi :: Trait for components that take parameters.
params() - Method in interface org.apache.spark.ml.param.Params: Returns all params sorted by their names.
ParamValidators - Class in org.apache.spark.ml.param: :: DeveloperApi :: Factory methods for common validation functions for Param.isValid.
ParamValidators() - Constructor for class org.apache.spark.ml.param.ParamValidators
parent() - Method in class org.apache.spark.ml.Model: The parent estimator that produced this model.
parent() - Method in class org.apache.spark.ml.param.Param
parent() - Method in interface org.apache.spark.scheduler.Schedulable
ParentClassLoader - Class in org.apache.spark.util: A class loader which makes some protected methods in ClassLoader accessible.
ParentClassLoader(ClassLoader) - Constructor for class org.apache.spark.util.ParentClassLoader
parentIds() - Method in class org.apache.spark.scheduler.StageInfo
parentIds() - Method in class org.apache.spark.storage.RDDInfo
parentIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Get the parent index of the given node, or 0 if it is the root.
parmap(Col, String, int, Function1<I, O>, CanBuildFrom<Col, Future<O>, Col>, CanBuildFrom<Col, O, Col>) - Static method in class org.apache.spark.util.ThreadUtils: Transforms input collection by applying the given function to each element in parallel fashion.
parquet(String...) - Method in class org.apache.spark.sql.DataFrameReader: Loads a Parquet file, returning the result as a DataFrame.
parquet(String) - Method in class org.apache.spark.sql.DataFrameReader: Loads a Parquet file, returning the result as a DataFrame.
parquet(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads a Parquet file, returning the result as a DataFrame.
parquet(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame in Parquet format at the specified path.
parquet(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Loads a Parquet file stream, returning the result as a DataFrame.
parquetFile(String...) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().parquet().
parquetFile(Seq<String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated. Use read.parquet() instead. Since 1.4.0.
parse(String) - Static method in class org.apache.spark.ml.feature.RFormulaParser
parse(String) - Static method in class org.apache.spark.mllib.linalg.Vectors: Parses a string resulting from Vector.toString into a Vector.
parse(String) - Static method in class org.apache.spark.mllib.regression.LabeledPoint: Parses a string resulting from LabeledPoint#toString into a LabeledPoint.
parse(String) - Static method in class org.apache.spark.mllib.util.NumericParser: Parses a string into a Double, an Array[Double], or a Seq[Any].
parseAll(Parsers.Parser<T>, Reader<Object>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
parseAll(Parsers.Parser<T>, Reader) - Static method in class org.apache.spark.ml.feature.RFormulaParser
parseAll(Parsers.Parser<T>, CharSequence) - Static method in class org.apache.spark.ml.feature.RFormulaParser
parseHostPort(String) - Static method in class org.apache.spark.util.Utils
parseIgnoreCase(Class<E>, String) - Static method in class org.apache.spark.util.EnumUtil
Parser(Function1<Reader<Object>, Parsers.ParseResult<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
parseStandaloneMasterUrls(String) - Static method in class org.apache.spark.util.Utils: Split the comma delimited string of master URLs into a list.
PartialResult<R> - Class in org.apache.spark.partial
PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
Partition - Interface in org.apache.spark: An identifier for a partition in an RDD.
partition() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
partition() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
partition(String) - Method in class org.apache.spark.status.LiveRDD
partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a copy of the RDD partitioned using the specified partitioner.
partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.Graph: Repartitions the edges in the graph according to partitionStrategy.
partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.Graph: Repartitions the edges in the graph according to partitionStrategy.
partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.impl.GraphImpl
partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.impl.GraphImpl
partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return a copy of the RDD partitioned using the specified partitioner.
partitionBy(String...) - Method in class org.apache.spark.sql.DataFrameWriter: Partitions the output by the given columns on the file system.
partitionBy(Seq<String>) - Method in class org.apache.spark.sql.DataFrameWriter: Partitions the output by the given columns on the file system.
partitionBy(String, String...) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the partitioning defined.
partitionBy(Column...) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the partitioning defined.
partitionBy(String, Seq<String>) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the partitioning defined.
partitionBy(Seq<Column>) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the partitioning defined.
partitionBy(String, String...) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the partitioning columns in a WindowSpec.
partitionBy(Column...) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the partitioning columns in a WindowSpec.
partitionBy(String, Seq<String>) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the partitioning columns in a WindowSpec.
partitionBy(Seq<Column>) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the partitioning columns in a WindowSpec.
partitionBy(String...) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Partitions the output by the given columns on the file system.
partitionBy(Seq<String>) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Partitions the output by the given columns on the file system.
PartitionCoalescer - Interface in org.apache.spark.rdd: ::DeveloperApi:: A PartitionCoalescer defines how to coalesce the partitions of a given RDD.
partitioner() - Method in interface org.apache.spark.api.java.JavaRDDLike: The partitioner of this RDD.
partitioner() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl: If partitionsRDD already has a partitioner, use it.
partitioner() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
Partitioner - Class in org.apache.spark: An object that defines how the elements in a key-value pair RDD are partitioned by key.
Partitioner() - Constructor for class org.apache.spark.Partitioner
partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
partitioner() - Method in class org.apache.spark.rdd.RDD: Optionally overridden by subclasses to specify how they are partitioned.
partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
partitioner() - Method in class org.apache.spark.ShuffleDependency
partitioner(Partitioner) - Method in class org.apache.spark.streaming.StateSpec: Set the partitioner by which the state RDDs generated by mapWithState will be partitioned.
PartitionGroup - Class in org.apache.spark.rdd: ::DeveloperApi:: A group of Partitions. Parameter prefLoc: preferred location for the partition group.
PartitionGroup(Option<String>) - Constructor for class org.apache.spark.rdd.PartitionGroup
partitionId() - Method in class org.apache.spark.BarrierTaskContext
partitionID() - Method in class org.apache.spark.TaskCommitDenied
partitionId() - Method in class org.apache.spark.TaskContext: The ID of the RDD partition that is computed by this task.
Partitioning - Interface in org.apache.spark.sql.sources.v2.reader.partitioning: An interface to represent the output data partitioning for a data source, which is returned by SupportsReportPartitioning.outputPartitioning().
PartitionLocations(RDD<?>) - Constructor for class org.apache.spark.rdd.DefaultPartitionCoalescer.PartitionLocations
PartitionOffset - Interface in org.apache.spark.sql.sources.v2.reader.streaming: Used for per-partition offsets in continuous processing.
PartitionPruningRDD<T> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions.
PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
partitions() - Method in interface org.apache.spark.api.java.JavaRDDLike: Set of partitions in this RDD.
partitions() - Method in class org.apache.spark.rdd.PartitionGroup
partitions() - Method in class org.apache.spark.rdd.RDD: Get the array of partitions of this RDD, taking into account whether the RDD is checkpointed or not.
partitions() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
partitionsRDD() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
partitionsRDD() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
PartitionStrategy - Interface in org.apache.spark.graphx: Represents the way edges are assigned to edge partitions based on their source and destination vertex IDs.
PartitionStrategy.CanonicalRandomVertexCut$ - Class in org.apache.spark.graphx: Assigns edges to partitions by hashing the source and destination vertex IDs in a canonical direction, resulting in a random vertex cut that colocates all edges between two vertices, regardless of direction.
PartitionStrategy.EdgePartition1D$ - Class in org.apache.spark.graphx: Assigns edges to partitions using only the source vertex ID, colocating edges with the same source.
PartitionStrategy.EdgePartition2D$ - Class in org.apache.spark.graphx: Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix, guaranteeing a 2 * sqrt(numParts) bound on vertex replication.
PartitionStrategy.RandomVertexCut$ - Class in org.apache.spark.graphx: Assigns edges to partitions by hashing the source and destination vertex IDs, resulting in a random vertex cut that colocates all same-direction edges between two vertices.
partsWithLocs() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer.PartitionLocations
partsWithoutLocs() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer.PartitionLocations
path() - Method in class org.apache.spark.scheduler.InputFormatInfo
path() - Method in class org.apache.spark.scheduler.SplitInfo
PATH_KEY - Static variable in class org.apache.spark.sql.sources.v2.DataSourceOptions: The option key for singular path.
paths() - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions: Returns all the paths specified by both the singular path option and the multiple paths option.
PATHS_KEY - Static variable in class org.apache.spark.sql.sources.v2.DataSourceOptions: The option key for multiple paths.
pattern() - Method in class org.apache.spark.ml.feature.RegexTokenizer: Regex pattern used to match delimiters if gaps is true or tokens if gaps is false.
pc() - Method in class org.apache.spark.ml.feature.PCAModel
pc() - Method in class org.apache.spark.mllib.feature.PCAModel
PCA - Class in org.apache.spark.ml.feature: PCA trains a model to project vectors to a lower dimensional space of the top k principal components.
PCA(String) - Constructor for class org.apache.spark.ml.feature.PCA
PCA() - Constructor for class org.apache.spark.ml.feature.PCA
PCA - Class in org.apache.spark.mllib.feature: A feature transformer that projects vectors to a low-dimensional space using PCA.
PCA(int) - Constructor for class org.apache.spark.mllib.feature.PCA
PCAModel - Class in org.apache.spark.ml.feature: Model fitted by PCA.
PCAModel - Class in org.apache.spark.mllib.feature: Model fitted by PCA that can project vectors to a low-dimensional space using PCA.
PCAParams - Interface in org.apache.spark.ml.feature: Params for PCA and PCAModel.
PCAUtil - Class in org.apache.spark.mllib.feature
PCAUtil() - Constructor for class org.apache.spark.mllib.feature.PCAUtil
pdf(Vector) - Method in class org.apache.spark.ml.stat.distribution.MultivariateGaussian: Returns the density of this multivariate Gaussian at the given point x.
pdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian: Returns the density of this multivariate Gaussian at the given point x.
PEAK_EXECUTION_MEMORY() - Static method in class org.apache.spark.InternalAccumulator
PEAK_EXECUTION_MEMORY() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
PEAK_EXECUTION_MEMORY() - Static method in class org.apache.spark.ui.ToolTips
PEAK_MEM() - Static method in class org.apache.spark.status.TaskIndexNames
peakExecutionMemory() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
peakExecutionMemory() - Method in class org.apache.spark.status.api.v1.TaskMetrics
PEARSON() - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
PearsonCorrelation - Class in org.apache.spark.mllib.stat.correlation: Compute Pearson correlation for two RDDs of the type RDD[Double] or the correlation matrix for an RDD of the type RDD[Vector].
PearsonCorrelation() - Constructor for class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
percent_rank() - Static method in class org.apache.spark.sql.functions: Window function: returns the relative rank (i.e. percentile) of rows within a window partition.
percentile() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams: Percentile of features that selector will select, ordered by statistics value descending.
percentile() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.graphx.Graph: Caches the vertices and edges associated with this graph at the specified storage level, ignoring any target storage levels previously set.
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl: Persists the edge partitions at the specified storage level, ignoring any existing target storage level.
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.GraphImpl
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl: Persists the vertex partitions at the specified storage level, ignoring any existing target storage level.
persist(StorageLevel) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Persists the underlying RDD with the specified storage level.
persist(StorageLevel) - Method in class org.apache.spark.rdd.HadoopRDD
persist(StorageLevel) - Method in class org.apache.spark.rdd.NewHadoopRDD
persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist() - Method in class org.apache.spark.rdd.RDD: Persist this RDD with the default storage level (MEMORY_ONLY).
persist() - Method in class org.apache.spark.sql.Dataset: Persist this Dataset with the default storage level (MEMORY_AND_DISK).
persist(StorageLevel) - Method in class org.apache.spark.sql.Dataset: Persist this Dataset with the given storage level.
persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist the RDDs of this DStream with the given storage level
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream: Persist the RDDs of this DStream with the given storage level
persist() - Method in class org.apache.spark.streaming.dstream.DStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
personalizedPageRank(long, double, double) - Method in class org.apache.spark.graphx.GraphOps: Run personalized PageRank for a given vertex, such that all random walks are started relative to the source node.
phrase(Parsers.Parser<T>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
pi() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
pickBin(Partition, RDD<?>, double, DefaultPartitionCoalescer.PartitionLocations) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer: Takes a parent RDD partition and decides which of the partition groups to put it in. Takes locality into account, but also uses power of 2 choices to load balance. It strikes a balance between the two using the balanceSlack variable.
pickRandomVertex() - Method in class org.apache.spark.graphx.GraphOps: Picks a random vertex from the graph and returns its ID.
pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(List<String>, Map<String, String>, boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(List<String>, Map<String, String>, boolean, int, String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(String) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean, int, String) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
Pipeline - Class in org.apache.spark.ml: A simple pipeline, which acts as an estimator.
Pipeline(String) - Constructor for class org.apache.spark.ml.Pipeline
Pipeline() - Constructor for class org.apache.spark.ml.Pipeline
Pipeline.SharedReadWrite$ - Class in org.apache.spark.ml: Methods for MLReader and MLWriter shared between Pipeline and PipelineModel
PipelineModel - Class in org.apache.spark.ml: Represents a fitted pipeline.
PipelineStage - Class in org.apache.spark.ml: :: DeveloperApi :: A stage in a pipeline, either an Estimator or a Transformer.
PipelineStage() - Constructor for class org.apache.spark.ml.PipelineStage
pivot(String) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Pivots a column of the current DataFrame and performs the specified aggregation.
pivot(String, Seq<Object>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Pivots a column of the current DataFrame and performs the specified aggregation.
pivot(String, List<Object>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: (Java-specific) Pivots a column of the current DataFrame and performs the specified aggregation.
pivot(Column) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Pivots a column of the current DataFrame and performs the specified aggregation.
pivot(Column, Seq<Object>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Pivots a column of the current DataFrame and performs the specified aggregation.
pivot(Column, List<Object>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: (Java-specific) Pivots a column of the current DataFrame and performs the specified aggregation.
PivotType$() - Constructor for class org.apache.spark.sql.RelationalGroupedDataset.PivotType$
plan() - Method in exception org.apache.spark.sql.AnalysisException
planBatchInputPartitions() - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsScanColumnarBatch: Similar to DataSourceReader.planInputPartitions(), but returns columnar data in batches.
planInputPartitions() - Method in interface org.apache.spark.sql.sources.v2.reader.DataSourceReader: Returns a list of InputPartitions.
planInputPartitions() - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsScanColumnarBatch
plus(Object) - Method in class org.apache.spark.sql.Column: Sum of this expression and another expression.
plus(Decimal, Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
plus(Duration) - Method in class org.apache.spark.streaming.Duration
plus(Duration) - Method in class org.apache.spark.streaming.Time
pmml() - Method in interface org.apache.spark.mllib.pmml.export.PMMLModelExport: Holder of the exported model in PMML format
PMMLExportable - Interface in org.apache.spark.mllib.pmml: :: DeveloperApi :: Export model to the PMML format. Predictive Model Markup Language (PMML) is an XML-based file format developed by the Data Mining Group (www.dmg.org).
PMMLKMeansModelWriter - Class in org.apache.spark.ml.clustering: A writer for KMeans that handles the "pmml" format
PMMLKMeansModelWriter() - Constructor for class org.apache.spark.ml.clustering.PMMLKMeansModelWriter
PMMLLinearRegressionModelWriter - Class in org.apache.spark.ml.regression: A writer for LinearRegression that handles the "pmml" format
PMMLLinearRegressionModelWriter() - Constructor for class org.apache.spark.ml.regression.PMMLLinearRegressionModelWriter
PMMLModelExport - Interface in org.apache.spark.mllib.pmml.export
PMMLModelExportFactory - Class in org.apache.spark.mllib.pmml.export
PMMLModelExportFactory() - Constructor for class org.apache.spark.mllib.pmml.export.PMMLModelExportFactory
pmod(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns the positive value of dividend mod divisor.
point() - Method in class org.apache.spark.mllib.feature.VocabWord
POINTS() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
pointSilhouetteCoefficient(Set<Object>, double, long, Function1<Object, Object>) - Static method in class org.apache.spark.ml.evaluation.CosineSilhouette
pointSilhouetteCoefficient(Set<Object>, double, long, Function1<Object, Object>) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette
POISON_PILL() - Static method in class org.apache.spark.scheduler.AsyncEventQueue
Poisson$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
PoissonBounds - Class in org.apache.spark.util.random: Utility functions that help us determine bounds on adjusted sampling rate to guarantee exact sample sizes with high confidence when sampling with replacement.
PoissonBounds() - Constructor for class org.apache.spark.util.random.PoissonBounds
PoissonGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the Poisson distribution with the given mean.
PoissonGenerator(double) - Constructor for class org.apache.spark.mllib.random.PoissonGenerator
poissonJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.poissonRDD.
poissonJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaRDD with the default seed.
poissonJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaRDD with the default number of partitions and the default seed.
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.poissonVectorRDD.
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaVectorRDD with the default seed.
poissonJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaVectorRDD with the default number of partitions and the default seed.
poissonRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.
PoissonSampler<T> - Class in org.apache.spark.util.random: :: DeveloperApi :: A sampler for sampling with replacement, based on values drawn from Poisson distribution.
PoissonSampler(double, boolean) - Constructor for class org.apache.spark.util.random.PoissonSampler
PoissonSampler(double) - Constructor for class org.apache.spark.util.random.PoissonSampler
poissonVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean.
PolynomialExpansion - Class in org.apache.spark.ml.feature: Perform feature expansion in a polynomial space.
PolynomialExpansion(String) - Constructor for class org.apache.spark.ml.feature.PolynomialExpansion
PolynomialExpansion() - Constructor for class org.apache.spark.ml.feature.PolynomialExpansion
popStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the population standard deviation of this RDD's elements.
popStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the population standard deviation of this RDD's elements.
popStdev() - Method in class org.apache.spark.util.StatCounter: Return the population standard deviation of the values.
popVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the population variance of this RDD's elements.
popVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the population variance of this RDD's elements.
popVariance() - Method in class org.apache.spark.util.StatCounter: Return the population variance of the values.
port() - Method in interface org.apache.spark.SparkExecutorInfo
port() - Method in class org.apache.spark.SparkExecutorInfoImpl
port() - Method in class org.apache.spark.storage.BlockManagerId
PortableDataStream - Class in org.apache.spark.input: A class that allows DataStreams to be serialized and moved around by not creating them until they need to be read
PortableDataStream(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.PortableDataStream
portMaxRetries(SparkConf) - Static method in class org.apache.spark.util.Utils: Maximum number of retries when binding to a port before giving up.
posexplode(Column) - Static method in class org.apache.spark.sql.functions: Creates a new row for each element with position in the given array or map column.
posexplode_outer(Column) - Static method in class org.apache.spark.sql.functions: Creates a new row for each element with position in the given array or map column.
position() - Method in class org.apache.spark.storage.ReadableChannelFileRegion
positioned(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
post(SparkListenerEvent) - Method in class org.apache.spark.scheduler.AsyncEventQueue
Postfix$() - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan.Postfix$
PostgresDialect - Class in org.apache.spark.sql.jdbc
PostgresDialect() - Constructor for class org.apache.spark.sql.jdbc.PostgresDialect
postStartHook() - Method in interface org.apache.spark.scheduler.TaskScheduler
postToAll(E) - Method in interface org.apache.spark.util.ListenerBus: Post the event to all registered listeners.
pow(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(Column, String) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(String, Column) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(String, String) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(Column, double) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(String, double) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(double, Column) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(double, String) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
POW_10() - Static method in class org.apache.spark.sql.types.Decimal
PowerIterationClustering - Class in org.apache.spark.ml.clustering: :: Experimental :: Power Iteration Clustering (PIC), a scalable graph clustering algorithm developed by Lin and Cohen.
PowerIterationClustering() - Constructor for class org.apache.spark.ml.clustering.PowerIterationClustering
PowerIterationClustering - Class in org.apache.spark.mllib.clustering: Power Iteration Clustering (PIC), a scalable graph clustering algorithm developed by Lin and Cohen.
PowerIterationClustering() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering: Constructs a PIC instance with default parameters: {k: 2, maxIterations: 100, initMode: "random"}.
PowerIterationClustering.Assignment - Class in org.apache.spark.mllib.clustering: Cluster assignment.
PowerIterationClustering.Assignment$ - Class in org.apache.spark.mllib.clustering
PowerIterationClusteringModel - Class in org.apache.spark.mllib.clustering: Model produced by PowerIterationClustering.
PowerIterationClusteringModel(int, RDD<PowerIterationClustering.Assignment>) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
PowerIterationClusteringModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.clustering
PowerIterationClusteringParams - Interface in org.apache.spark.ml.clustering: Common params for PowerIterationClustering
pr() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Returns the precision-recall curve, which is a Dataframe containing two fields, recall and precision, with (0.0, 1.0) prepended to it.
pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the precision-recall curve, which is an RDD of (recall, precision), NOT (precision, recall), with (0.0, p) prepended to it, where p is the precision associated with the lowest recall on the curve.
preciseSize() - Method in interface org.apache.spark.storage.memory.MemoryEntryBuilder
Precision - Class in org.apache.spark.mllib.evaluation.binary: Precision.
Precision() - Constructor for class org.apache.spark.mllib.evaluation.binary.Precision
precision(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns precision for a given label (category)
precision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Deprecated. Use accuracy. Since 2.0.0.
precision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns document-based precision averaged by the number of documents
precision(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns precision for a given label (category)
precision() - Method in class org.apache.spark.sql.types.Decimal
precision() - Method in class org.apache.spark.sql.types.DecimalType
precisionAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics: Compute the average precision of all the queries, truncated at ranking position k.
precisionByLabel() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns precision for each label (category).
precisionByThreshold() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Returns the (threshold, precision) curve as a dataframe with two fields.
precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, precision) curve.
predict(Vector) - Method in interface org.apache.spark.ml.ann.TopologyModel: Prediction of the model.
predict(FeaturesType) - Method in class org.apache.spark.ml.classification.ClassificationModel: Predict label for the given features.
predict(Vector) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
predict(Vector) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
predict(Vector) - Method in class org.apache.spark.ml.classification.LinearSVCModel
predict(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: Predict label for the given feature vector.
predict(Vector) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel: Predict label for the given features.
predict(FeaturesType) - Method in class org.apache.spark.ml.PredictionModel: Predict label for the given features.
predict(Vector) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
predict(Vector) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
predict(Vector) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
predict(Vector) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
predict(Vector) - Method in class org.apache.spark.ml.regression.LinearRegressionModel
predict(Vector) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for examples stored in a JavaRDD.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
predict(Vector) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Predicts the index of the cluster that the input point belongs to.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Predicts the indices of the clusters that the input points belong to.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Java-friendly version of predict().
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Maps given points to their cluster indices.
predict(Vector) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Maps given point to its cluster index.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Java-friendly version of predict()
predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Returns the cluster index that a given point belongs to.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Maps given points to their cluster indices.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Maps given points to their cluster indices.
predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Predict the rating of one user for one product.
predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Predict the rating of many users for many products.
predict(JavaPairRDD<Integer, Integer>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Java-friendly version of MatrixFactorizationModel.predict.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Predict values for the given data set using the model trained.
predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Predict values for a single data point using the model trained.
predict(RDD<Object>) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel: Predict labels for provided features.
predict(JavaDoubleRDD) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel: Predict labels for provided features.
predict(double) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel: Predict a single label.
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for examples stored in a JavaRDD.
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Predict values for the given data set using the model trained.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Predict values for the given data set using the model trained.
predict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
predict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
predict() - Method in class org.apache.spark.mllib.tree.model.Node
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.Node: predict value if node is not leaf
Predict - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Predicted value for a node. param: predict predicted value. param: prob probability of the label (classification only).
Predict(double, double) - Constructor for class org.apache.spark.mllib.tree.model.Predict
predict() - Method in class org.apache.spark.mllib.tree.model.Predict
PredictData(double, double) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
PredictData$() - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData$
prediction() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
prediction() - Method in class org.apache.spark.ml.tree.InternalNode
prediction() - Method in class org.apache.spark.ml.tree.LeafNode
prediction() - Method in class org.apache.spark.ml.tree.Node: Prediction a leaf node makes, or which an internal node would make if it were a leaf node
predictionCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Field in "predictions" which gives the prediction of each class.
predictionCol() - Method in class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
predictionCol() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
predictionCol() - Method in interface org.apache.spark.ml.param.shared.HasPredictionCol: Param for prediction column name.
predictionCol() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: Field in "predictions" which gives the predicted value of each instance.
predictionCol() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
PredictionModel<FeaturesType,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml: :: DeveloperApi :: Abstraction for a model for prediction tasks (regression and classification).
PredictionModel() - Constructor for class org.apache.spark.ml.PredictionModel
predictions() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Dataframe output by the model's transform method.
predictions() - Method in class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
predictions() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
predictions() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: Predictions output by the model's transform method.
predictions() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel: Predictions associated with the boundaries at the same index, monotone because of isotonic regression.
predictions() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
predictions() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Use the clustering model to make predictions on batches of data from a DStream.
predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Java-friendly version of predictOn.
predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Use the model to make predictions on batches of data from a DStream
predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Java-friendly version of predictOn.
predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Use the model to make predictions on the values of a DStream and carry over its keys.
predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Java-friendly version of predictOnValues.
predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Use the model to make predictions on the values of a DStream and carry over its keys.
predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Java-friendly version of predictOnValues.
Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml: :: DeveloperApi :: Abstraction for prediction problems (regression and classification).
Predictor() - Constructor for class org.apache.spark.ml.Predictor
PredictorParams - Interface in org.apache.spark.ml: (private[ml]) Trait for parameters for prediction (regression and classification).
predictProbabilities(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel: Predict values for the given data set using the model trained.
predictProbabilities(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel: Predict posterior class probabilities for a single data point using the model trained.
predictQuantiles(Vector) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
predictRaw(Vector) - Method in interface org.apache.spark.ml.ann.TopologyModel: Raw prediction of the model.
predictSoft(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Given the input vectors, return the membership value of each vector to all mixture components.
predictSoft(Vector) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Given the input vector, return the membership values to all mixture components.
preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver: Override this to specify a preferred location (hostname).
preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD: Get the preferred locations of a partition, taking into account whether the RDD is checkpointed.
preferredLocations() - Method in interface org.apache.spark.sql.sources.v2.reader.InputPartition: The preferred locations where the input partition reader returned by this partition can run faster, but Spark does not guarantee to run the input partition reader on these locations.
Prefix$() - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan.Prefix$
prefixesToRewrite() - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
PrefixSpan - Class in org.apache.spark.ml.fpm: :: Experimental :: A parallel PrefixSpan algorithm to mine frequent sequential patterns.
PrefixSpan(String) - Constructor for class org.apache.spark.ml.fpm.PrefixSpan
PrefixSpan() - Constructor for class org.apache.spark.ml.fpm.PrefixSpan
PrefixSpan - Class in org.apache.spark.mllib.fpm: A parallel PrefixSpan algorithm to mine frequent sequential patterns.
PrefixSpan() - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan: Constructs a default instance with default parameters {minSupport: 0.1, maxPatternLength: 10, maxLocalProjDBSize: 32000000L}.
PrefixSpan.FreqSequence<Item> - Class in org.apache.spark.mllib.fpm: Represents a frequent sequence.
PrefixSpan.Postfix$ - Class in org.apache.spark.mllib.fpm
PrefixSpan.Prefix$ - Class in org.apache.spark.mllib.fpm
PrefixSpanModel<Item> - Class in org.apache.spark.mllib.fpm: Model fitted by PrefixSpan. param: freqSequences frequent sequences.
PrefixSpanModel(RDD<PrefixSpan.FreqSequence<Item>>) - Constructor for class org.apache.spark.mllib.fpm.PrefixSpanModel
PrefixSpanModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.fpm
prefLoc() - Method in class org.apache.spark.rdd.PartitionGroup
pregel(A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<A>) - Method in class org.apache.spark.graphx.GraphOps: Execute a Pregel-like iterative vertex-parallel abstraction.
Pregel - Class in org.apache.spark.graphx: Implements a Pregel-like bulk-synchronous message-passing API.
Pregel() - Constructor for class org.apache.spark.graphx.Pregel
prepareWritable(Writable, Seq<Tuple2<String, String>>) - Static method in class org.apache.spark.sql.hive.HiveShim
prepareWrite(SparkSession, Job, Map<String, String>, StructType) - Method in class org.apache.spark.sql.hive.execution.HiveFileFormat
prepareWrite(SparkSession, Job, Map<String, String>, StructType) - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
prependBaseUri(HttpServletRequest, String, String) - Static method in class org.apache.spark.ui.UIUtils
prettyJson() - Method in class org.apache.spark.sql.streaming.SinkProgress: The pretty (i.e.
prettyJson() - Method in class org.apache.spark.sql.streaming.SourceProgress: The pretty (i.e.
prettyJson() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress: The pretty (i.e.
prettyJson() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress: The pretty (i.e.
prettyJson() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus: The pretty (i.e.
prettyJson() - Static method in class org.apache.spark.sql.types.BinaryType
prettyJson() - Static method in class org.apache.spark.sql.types.BooleanType
prettyJson() - Static method in class org.apache.spark.sql.types.ByteType
prettyJson() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
prettyJson() - Method in class org.apache.spark.sql.types.DataType: The pretty (i.e.
prettyJson() - Static method in class org.apache.spark.sql.types.DateType
prettyJson() - Static method in class org.apache.spark.sql.types.DoubleType
prettyJson() - Static method in class org.apache.spark.sql.types.FloatType
prettyJson() - Static method in class org.apache.spark.sql.types.IntegerType
prettyJson() - Static method in class org.apache.spark.sql.types.LongType
prettyJson() - Static method in class org.apache.spark.sql.types.NullType
prettyJson() - Static method in class org.apache.spark.sql.types.ShortType
prettyJson() - Static method in class org.apache.spark.sql.types.StringType
prettyJson() - Static method in class org.apache.spark.sql.types.TimestampType
prettyPrint() - Method in class org.apache.spark.streaming.Duration
prev() - Method in class org.apache.spark.rdd.ShuffledRDD
prev() - Method in class org.apache.spark.status.LiveRDDPartition
prevPageSizeFormField() - Method in interface org.apache.spark.ui.PagedTable
print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Print the first ten elements of each RDD generated in this DStream.
print(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Print the first num elements of each RDD generated in this DStream.
print() - Method in class org.apache.spark.streaming.dstream.DStream: Print the first ten elements of each RDD generated in this DStream.
print(int) - Method in class org.apache.spark.streaming.dstream.DStream: Print the first num elements of each RDD generated in this DStream.
printErrorAndExit(String) - Method in interface org.apache.spark.util.CommandLineUtils
printMessage(String) - Method in interface org.apache.spark.util.CommandLineUtils
printSchema() - Method in class org.apache.spark.sql.Dataset: Prints the schema to the console in a nice tree format.
printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
printStream() - Method in interface org.apache.spark.util.CommandLineUtils
printTreeString() - Method in class org.apache.spark.sql.types.StructType
prioritize(BlockManagerId, Seq<BlockManagerId>, HashSet<BlockManagerId>, BlockId, int) - Method in class org.apache.spark.storage.BasicBlockReplicationPolicy: Method to prioritize a bunch of candidate peers of a block manager.
prioritize(BlockManagerId, Seq<BlockManagerId>, HashSet<BlockManagerId>, BlockId, int) - Method in interface org.apache.spark.storage.BlockReplicationPolicy: Method to prioritize a bunch of candidate peers of a block.
prioritize(BlockManagerId, Seq<BlockManagerId>, HashSet<BlockManagerId>, BlockId, int) - Method in class org.apache.spark.storage.RandomBlockReplicationPolicy: Method to prioritize a bunch of candidate peers of a block.
priority() - Method in interface org.apache.spark.scheduler.Schedulable
prob() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
prob() - Method in class org.apache.spark.mllib.tree.model.Predict
ProbabilisticClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification: :: DeveloperApi ::
ProbabilisticClassificationModel() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassificationModel
ProbabilisticClassifier<FeaturesType,E extends ProbabilisticClassifier<FeaturesType,E,M>,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification: :: DeveloperApi ::
ProbabilisticClassifier() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassifier
ProbabilisticClassifierParams - Interface in org.apache.spark.ml.classification: (private[classification]) Params for probabilistic classification.
probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
probability() - Method in class org.apache.spark.ml.clustering.GaussianMixtureSummary: Probability of each cluster.
probabilityCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Field in "predictions" which gives the probability of each class as a vector.
probabilityCol() - Method in class org.apache.spark.ml.classification.LogisticRegressionSummaryImpl
probabilityCol() - Method in class org.apache.spark.ml.clustering.GaussianMixtureSummary
probabilityCol() - Method in interface org.apache.spark.ml.param.shared.HasProbabilityCol: Param for Column name for predicted class conditional probabilities.
Probit$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Probit$
process(T) - Method in class org.apache.spark.sql.ForeachWriter: Called to process the data on the executor side.
PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
processAllAvailable() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Blocks until all available data in the source has been processed and committed to the sink.
processedRowsPerSecond() - Method in class org.apache.spark.sql.streaming.SourceProgress
processedRowsPerSecond() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress: The aggregate (across all sources) rate at which Spark is processing data.
processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for all the jobs of this batch to finish processing, measured from the time they started processing.
processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
ProcessingTime - Class in org.apache.spark.sql.streaming: Deprecated. Use Trigger.ProcessingTime(intervalMs). Since 2.2.0.
ProcessingTime(long) - Constructor for class org.apache.spark.sql.streaming.ProcessingTime: Deprecated.
ProcessingTime(long) - Static method in class org.apache.spark.sql.streaming.Trigger: A trigger policy that runs a query periodically based on an interval in processing time.
ProcessingTime(long, TimeUnit) - Static method in class org.apache.spark.sql.streaming.Trigger: (Java-friendly) A trigger policy that runs a query periodically based on an interval in processing time.
ProcessingTime(Duration) - Static method in class org.apache.spark.sql.streaming.Trigger: (Scala-friendly) A trigger policy that runs a query periodically based on an interval in processing time.
ProcessingTime(String) - Static method in class org.apache.spark.sql.streaming.Trigger: A trigger policy that runs a query periodically based on an interval in processing time.
processingTime() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
ProcessingTimeTimeout() - Static method in class org.apache.spark.sql.streaming.GroupStateTimeout: Timeout based on processing time.
processStreamByLine(String, InputStream, Function1<String, BoxedUnit>) - Static method in class org.apache.spark.util.Utils: Start and return a daemon thread that processes the content of the input stream line by line.
producedAttributes() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
product() - Method in class org.apache.spark.mllib.recommendation.Rating
product(TypeTags.TypeTag<T>) - Static method in class org.apache.spark.sql.Encoders: An encoder for Scala's product type (tuples, case classes, etc.).
productArity() - Static method in class org.apache.spark.ExpireDeadHosts
productArity() - Static method in class org.apache.spark.ml.feature.Dot
productArity() - Static method in class org.apache.spark.Resubmitted
productArity() - Static method in class org.apache.spark.rpc.netty.OnStart
productArity() - Static method in class org.apache.spark.rpc.netty.OnStop
productArity() - Static method in class org.apache.spark.scheduler.AllJobsCancelled
productArity() - Static method in class org.apache.spark.scheduler.JobSucceeded
productArity() - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
productArity() - Static method in class org.apache.spark.scheduler.StopCoordinator
productArity() - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
productArity() - Static method in class org.apache.spark.sql.jdbc.OracleDialect
productArity() - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
productArity() - Static method in class org.apache.spark.sql.types.BinaryType
productArity() - Static method in class org.apache.spark.sql.types.BooleanType
productArity() - Static method in class org.apache.spark.sql.types.ByteType
productArity() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
productArity() - Static method in class org.apache.spark.sql.types.DateType
productArity() - Static method in class org.apache.spark.sql.types.DoubleType
productArity() - Static method in class org.apache.spark.sql.types.FloatType
productArity() - Static method in class org.apache.spark.sql.types.IntegerType
productArity() - Static method in class org.apache.spark.sql.types.LongType
productArity() - Static method in class org.apache.spark.sql.types.NullType
productArity() - Static method in class org.apache.spark.sql.types.ShortType
productArity() - Static method in class org.apache.spark.sql.types.StringType
productArity() - Static method in class org.apache.spark.sql.types.TimestampType
productArity() - Static method in class org.apache.spark.StopMapOutputTracker
productArity() - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
productArity() - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
productArity() - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
productArity() - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
productArity() - Static method in class org.apache.spark.Success
productArity() - Static method in class org.apache.spark.TaskResultLost
productArity() - Static method in class org.apache.spark.TaskSchedulerIsSet
productArity() - Static method in class org.apache.spark.UnknownReason
productElement(int) - Static method in class org.apache.spark.ExpireDeadHosts
productElement(int) - Static method in class org.apache.spark.ml.feature.Dot
productElement(int) - Static method in class org.apache.spark.Resubmitted
productElement(int) - Static method in class org.apache.spark.rpc.netty.OnStart
productElement(int) - Static method in class org.apache.spark.rpc.netty.OnStop
productElement(int) - Static method in class org.apache.spark.scheduler.AllJobsCancelled
productElement(int) - Static method in class org.apache.spark.scheduler.JobSucceeded
productElement(int) - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
productElement(int) - Static method in class org.apache.spark.scheduler.StopCoordinator
productElement(int) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
productElement(int) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
productElement(int) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
productElement(int) - Static method in class org.apache.spark.sql.types.BinaryType
productElement(int) - Static method in class org.apache.spark.sql.types.BooleanType
productElement(int) - Static method in class org.apache.spark.sql.types.ByteType
productElement(int) - Static method in class org.apache.spark.sql.types.CalendarIntervalType
productElement(int) - Static method in class org.apache.spark.sql.types.DateType
productElement(int) - Static method in class org.apache.spark.sql.types.DoubleType
productElement(int) - Static method in class org.apache.spark.sql.types.FloatType
productElement(int) - Static method in class org.apache.spark.sql.types.IntegerType
productElement(int) - Static method in class org.apache.spark.sql.types.LongType
productElement(int) - Static method in class org.apache.spark.sql.types.NullType
productElement(int) - Static method in class org.apache.spark.sql.types.ShortType
productElement(int) - Static method in class org.apache.spark.sql.types.StringType
productElement(int) - Static method in class org.apache.spark.sql.types.TimestampType
productElement(int) - Static method in class org.apache.spark.StopMapOutputTracker
productElement(int) - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
productElement(int) - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
productElement(int) - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
productElement(int) - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
productElement(int) - Static method in class org.apache.spark.Success
productElement(int) - Static method in class org.apache.spark.TaskResultLost
productElement(int) - Static method in class org.apache.spark.TaskSchedulerIsSet
productElement(int) - Static method in class org.apache.spark.UnknownReason
productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
productIterator() - Static method in class org.apache.spark.ExpireDeadHosts
productIterator() - Static method in class org.apache.spark.ml.feature.Dot
productIterator() - Static method in class org.apache.spark.Resubmitted
productIterator() - Static method in class org.apache.spark.rpc.netty.OnStart
productIterator() - Static method in class org.apache.spark.rpc.netty.OnStop
productIterator() - Static method in class org.apache.spark.scheduler.AllJobsCancelled
productIterator() - Static method in class org.apache.spark.scheduler.JobSucceeded
productIterator() - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
productIterator() - Static method in class org.apache.spark.scheduler.StopCoordinator
productIterator() - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
productIterator() - Static method in class org.apache.spark.sql.jdbc.OracleDialect
productIterator() - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
productIterator() - Static method in class org.apache.spark.sql.types.BinaryType
productIterator() - Static method in class org.apache.spark.sql.types.BooleanType
productIterator() - Static method in class org.apache.spark.sql.types.ByteType
productIterator() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
productIterator() - Static method in class org.apache.spark.sql.types.DateType
productIterator() - Static method in class org.apache.spark.sql.types.DoubleType
productIterator() - Static method in class org.apache.spark.sql.types.FloatType
productIterator() - Static method in class org.apache.spark.sql.types.IntegerType
productIterator() - Static method in class org.apache.spark.sql.types.LongType
productIterator() - Static method in class org.apache.spark.sql.types.NullType
productIterator() - Static method in class org.apache.spark.sql.types.ShortType
productIterator() - Static method in class org.apache.spark.sql.types.StringType
productIterator() - Static method in class org.apache.spark.sql.types.TimestampType
productIterator() - Static method in class org.apache.spark.StopMapOutputTracker
productIterator() - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
productIterator() - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
productIterator() - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
productIterator() - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
productIterator() - Static method in class org.apache.spark.Success
productIterator() - Static method in class org.apache.spark.TaskResultLost
productIterator() - Static method in class org.apache.spark.TaskSchedulerIsSet
productIterator() - Static method in class org.apache.spark.UnknownReason
productPrefix() - Static method in class org.apache.spark.ExpireDeadHosts
productPrefix() - Static method in class org.apache.spark.ml.feature.Dot
productPrefix() - Static method in class org.apache.spark.Resubmitted
productPrefix() - Static method in class org.apache.spark.rpc.netty.OnStart
productPrefix() - Static method in class org.apache.spark.rpc.netty.OnStop
productPrefix() - Static method in class org.apache.spark.scheduler.AllJobsCancelled
productPrefix() - Static method in class org.apache.spark.scheduler.JobSucceeded
productPrefix() - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
productPrefix() - Static method in class org.apache.spark.scheduler.StopCoordinator
productPrefix() - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
productPrefix() - Static method in class org.apache.spark.sql.jdbc.OracleDialect
productPrefix() - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
productPrefix() - Static method in class org.apache.spark.sql.types.BinaryType
productPrefix() - Static method in class org.apache.spark.sql.types.BooleanType
productPrefix() - Static method in class org.apache.spark.sql.types.ByteType
productPrefix() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
productPrefix() - Static method in class org.apache.spark.sql.types.DateType
productPrefix() - Static method in class org.apache.spark.sql.types.DoubleType
productPrefix() - Static method in class org.apache.spark.sql.types.FloatType
productPrefix() - Static method in class org.apache.spark.sql.types.IntegerType
productPrefix() - Static method in class org.apache.spark.sql.types.LongType
productPrefix() - Static method in class org.apache.spark.sql.types.NullType
productPrefix() - Static method in class org.apache.spark.sql.types.ShortType
productPrefix() - Static method in class org.apache.spark.sql.types.StringType
productPrefix() - Static method in class org.apache.spark.sql.types.TimestampType
productPrefix() - Static method in class org.apache.spark.StopMapOutputTracker
productPrefix() - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
productPrefix() - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
productPrefix() - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
productPrefix() - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
productPrefix() - Static method in class org.apache.spark.Success
productPrefix() - Static method in class org.apache.spark.TaskResultLost
productPrefix() - Static method in class org.apache.spark.TaskSchedulerIsSet
productPrefix() - Static method in class org.apache.spark.UnknownReason
progress() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryProgressEvent
project(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
project(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
propertiesFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
propertiesToJson(Properties) - Static method in class org.apache.spark.util.JsonProtocol
provider() - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
provider() - Method in interface org.apache.spark.streaming.kinesis.SparkAWSCredentials: Return an AWSCredentialsProvider instance that can be used by the Kinesis Client Library to authenticate to AWS services (Kinesis, CloudWatch and DynamoDB).
proxyBase() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
pruneColumns(StructType) - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsPushDownRequiredColumns: Applies column pruning w.r.t. the given schema.
PrunedFilteredScan - Interface in org.apache.spark.sql.sources: A BaseRelation that can eliminate unneeded columns and filter using selected predicates before producing an RDD containing all matching tuples as Row objects.
PrunedScan - Interface in org.apache.spark.sql.sources: A BaseRelation that can eliminate unneeded columns before producing an RDD containing all of its tuples as Row objects.
Pseudorandom - Interface in org.apache.spark.util.random: :: DeveloperApi :: A class with pseudorandom behavior.
pushedFilters() - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsPushDownFilters: Returns the filters that are pushed to the data source via SupportsPushDownFilters.pushFilters(Filter[]).
pushFilters(Filter[]) - Method in interface org.apache.spark.sql.sources.v2.reader.SupportsPushDownFilters: Pushes down filters, and returns filters that need to be evaluated after scanning.
put(ParamPair<?>...) - Method in class org.apache.spark.ml.param.ParamMap: Puts a list of param pairs (overwrites if the input params exist).
put(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap: Puts a (param, value) pair (overwrites if the input param exists).
put(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.param.ParamMap: Puts a list of param pairs (overwrites if the input params exist).
put(Object) - Method in class org.apache.spark.util.sketch.BloomFilter: Puts an item into this BloomFilter.
putBinary(byte[]) - Method in class org.apache.spark.util.sketch.BloomFilter: A specialized variant of BloomFilter.put(Object) that only supports byte array items.
putBoolean(String, boolean) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Boolean.
putBooleanArray(String, boolean[]) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Boolean array.
putDouble(String, double) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Double.
putDoubleArray(String, double[]) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Double array.
putLong(String, long) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Long.
putLong(long) - Method in class org.apache.spark.util.sketch.BloomFilter: A specialized variant of BloomFilter.put(Object) that only supports long items.
putLongArray(String, long[]) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Long array.
putMetadata(String, Metadata) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Metadata.
putMetadataArray(String, Metadata[]) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Metadata array.
putNull(String) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a null.
putString(String, String) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a String.
putString(String) - Method in class org.apache.spark.util.sketch.BloomFilter: A specialized variant of BloomFilter.put(Object) that only supports String items.
putStringArray(String, String[]) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a String array.
pValue() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
pValue() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
pValue() - Method in interface org.apache.spark.mllib.stat.test.TestResult: The probability of obtaining a test statistic result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.
pValues() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary: Two-sided p-value of estimated coefficients and intercept.
pValues() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Two-sided p-value of estimated coefficients and intercept.
PythonStreamingListener - Interface in org.apache.spark.streaming.api.java
pyUDT() - Method in class org.apache.spark.mllib.linalg.VectorUDT
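
The put, putBinary, putLong and putString entries for org.apache.spark.util.sketch.BloomFilter above all add an item to a probabilistic set. As a rough illustration of that idea only, the following is a minimal sketch in plain Java. The class name BloomFilterSketch, its constructor parameters and its hashing scheme are hypothetical and are not Spark's implementation; returning whether any bit changed is a choice made for this sketch.

```java
import java.util.BitSet;

// Hypothetical illustration only; this is NOT Spark's BloomFilter implementation.
final class BloomFilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashFunctions;

    BloomFilterSketch(int numBits, int numHashFunctions) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashFunctions = numHashFunctions;
    }

    // Derives the i-th bit index from two base hashes (double hashing).
    private int index(Object item, int i) {
        int h1 = item.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so the step is never zero
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // Sets the bits for the item and reports whether any bit changed.
    boolean put(Object item) {
        boolean changed = false;
        for (int i = 0; i < numHashFunctions; i++) {
            int idx = index(item, i);
            if (!bits.get(idx)) {
                bits.set(idx);
                changed = true;
            }
        }
        return changed;
    }

    // True if the item may have been put; false means it definitely was not.
    boolean mightContain(Object item) {
        for (int i = 0; i < numHashFunctions; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }
}
```

With a filter created as new BloomFilterSketch(1024, 3), any item passed to put is afterwards reported by mightContain, while an item that was never put is usually, but not always, rejected.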

Q

Q() - Method in class org.apache.spark.mllib.linalg.QRDecomposition
QRDecomposition<QType,RType> - Class in org.apache.spark.mllib.linalg: Represents QR factors.
QRDecomposition(QType, RType) - Constructor for class org.apache.spark.mllib.linalg.QRDecomposition
quantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
QuantileDiscretizer - Class in org.apache.spark.ml.feature: QuantileDiscretizer takes a column with continuous features and outputs a column with binned categorical features.
QuantileDiscretizer(String) - Constructor for class org.apache.spark.ml.feature.QuantileDiscretizer
QuantileDiscretizer() - Constructor for class org.apache.spark.ml.feature.QuantileDiscretizer
QuantileDiscretizerBase - Interface in org.apache.spark.ml.feature: Params for QuantileDiscretizer.
quantileProbabilities() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams: Param for quantile probabilities array.
quantiles() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
quantilesCol() - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams: Param for quantiles column name.
QuantileStrategy - Class in org.apache.spark.mllib.tree.configuration: Enum for selecting the quantile calculation strategy.
QuantileStrategy() - Constructor for class org.apache.spark.mllib.tree.configuration.QuantileStrategy
quarter(Column) - Static method in class org.apache.spark.sql.functions: Extracts the quarter as an integer from a given date/timestamp/string.
query() - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
query() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
query() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
queryExecution() - Method in class org.apache.spark.sql.Dataset
queryExecution() - Method in class org.apache.spark.sql.KeyValueGroupedDataset
QueryExecutionListener - Interface in org.apache.spark.sql.util: :: Experimental :: The interface of query execution listener that can be used to analyze execution metrics.
queryName(String) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Specifies the name of the StreamingQuery that can be started with start().
queueStream(Queue<JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean, JavaRDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, RDD<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from a queue of RDDs.
quot(Decimal, Decimal) - Method in class org.apache.spark.sql.types.Decimal.DecimalAsIfIntegral$
quoteIdentifier(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
quoteIdentifier(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
quoteIdentifier(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
quoteIdentifier(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Quotes the identifier.
quoteIdentifier(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
quoteIdentifier(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
quoteIdentifier(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
quoteIdentifier(String) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
quoteIdentifier(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
quoteIdentifier(String) - Static method in class org.apache.spark.sql.jdbc.TeradataDialect
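
The quoteIdentifier entries above quote a column or table name for a given JDBC dialect. The following plain-Java sketch shows the general idea. The class QuoteIdentifierSketch and its quote method are hypothetical, and both the quote character passed in and the doubling of embedded quote characters are assumptions of this sketch rather than the documented behavior of any Spark dialect.

```java
// Hypothetical illustration only; not taken from any Spark JdbcDialect.
final class QuoteIdentifierSketch {
    private QuoteIdentifierSketch() {}

    // Wraps the identifier in the given quote character and doubles any
    // embedded occurrence of that character (an assumption of this sketch).
    static String quote(String identifier, char quoteChar) {
        String q = String.valueOf(quoteChar);
        return q + identifier.replace(q, q + q) + q;
    }
}
```

For example, quote("col", '"') yields the identifier wrapped in double quotes, and a backtick can be passed instead for databases that quote identifiers that way.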

R

R() - Method in class org.apache.spark.mllib.linalg.QRDecomposition
r2() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Returns R², the coefficient of determination.
r2() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics: Returns R², the unadjusted coefficient of determination.
r2adj() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Returns Adjusted R², the adjusted coefficient of determination.
RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
radians(Column) - Static method in class org.apache.spark.sql.functions: Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
radians(String) - Static method in class org.apache.spark.sql.functions: Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
rand(int, int, Random) - Static method in class org.apache.spark.ml.linalg.DenseMatrix: Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
rand(int, int, Random) - Static method in class org.apache.spark.ml.linalg.Matrices: Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
rand(long) - Static method in class org.apache.spark.sql.functions: Generate a random column with independent and identically distributed (i.i.d.) samples from U[0.0, 1.0].
rand() - Static method in class org.apache.spark.sql.functions: Generate a random column with independent and identically distributed (i.i.d.) samples from U[0.0, 1.0].
randn(int, int, Random) - Static method in class org.apache.spark.ml.linalg.DenseMatrix: Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
randn(int, int, Random) - Static method in class org.apache.spark.ml.linalg.Matrices: Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
randn(long) - Static method in class org.apache.spark.sql.functions: Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
randn() - Static method in class org.apache.spark.sql.functions: Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
random() - Method in class org.apache.spark.ml.image.SamplePathFilter
RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
random() - Static method in class org.apache.spark.util.Utils
RandomBlockReplicationPolicy - Class in org.apache.spark.storage
RandomBlockReplicationPolicy() - Constructor for class org.apache.spark.storage.RandomBlockReplicationPolicy
RandomDataGenerator<T> - Interface in org.apache.spark.mllib.random: :: DeveloperApi :: Trait for random data generators that generate i.i.d. data.
RandomForest - Class in org.apache.spark.ml.tree.impl: ALGORITHM
RandomForest() - Constructor for class org.apache.spark.ml.tree.impl.RandomForest
RandomForest - Class in org.apache.spark.mllib.tree: A class that implements a Random Forest learning algorithm for classification and regression.
RandomForest(Strategy, int, String, int) - Constructor for class org.apache.spark.mllib.tree.RandomForest
RandomForestClassificationModel - Class in org.apache.spark.ml.classification: Random Forest model for classification.
RandomForestClassifier - Class in org.apache.spark.ml.classification: Random Forest learning algorithm for classification.
RandomForestClassifier(String) - Constructor for class org.apache.spark.ml.classification.RandomForestClassifier
RandomForestClassifier() - Constructor for class org.apache.spark.ml.classification.RandomForestClassifier
RandomForestClassifierParams - Interface in org.apache.spark.ml.tree
RandomForestModel - Class in org.apache.spark.mllib.tree.model: Represents a random forest model.
RandomForestModel(Enumeration.Value, DecisionTreeModel[]) - Constructor for class org.apache.spark.mllib.tree.model.RandomForestModel
RandomForestParams - Interface in org.apache.spark.ml.tree: Parameters for Random Forest algorithms.
RandomForestRegressionModel - Class in org.apache.spark.ml.regression: Random Forest model for regression.
RandomForestRegressor - Class in org.apache.spark.ml.regression: Random Forest learning algorithm for regression.
RandomForestRegressor(String) - Constructor for class org.apache.spark.ml.regression.RandomForestRegressor
RandomForestRegressor() - Constructor for class org.apache.spark.ml.regression.RandomForestRegressor
RandomForestRegressorParams - Interface in org.apache.spark.ml.tree
randomize(TraversableOnce<T>, ClassTag<T>) - Static method in class org.apache.spark.util.Utils: Shuffle the elements of a collection into a random order, returning the result in a new collection.
randomizeInPlace(Object, Random) - Static method in class org.apache.spark.util.Utils: Shuffle the elements of an array into a random order, modifying the original array.
randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: RandomRDDs.randomJavaRDD with the default seed.
randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: RandomRDDs.randomJavaRDD with the default seed and the default number of partitions.
randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: Java-friendly version of RandomRDDs.randomVectorRDD.
randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: RandomRDDs.randomJavaVectorRDD with the default seed.
randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: RandomRDDs.randomJavaVectorRDD with the default number of partitions and the default seed.
randomRDD(SparkContext, RandomDataGenerator<T>, long, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
RandomRDDs - Class in org.apache.spark.mllib.random: Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
RandomRDDs() - Constructor for class org.apache.spark.mllib.random.RandomRDDs
RandomSampler<T,U> - Interface in org.apache.spark.util.random: :: DeveloperApi :: A pseudorandom sampler.
randomSplit(double[]) - Method in class org.apache.spark.api.java.JavaRDD: Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.api.java.JavaRDD: Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD: Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.sql.Dataset: Randomly splits this Dataset with the provided weights.
randomSplit(double[]) - Method in class org.apache.spark.sql.Dataset: Randomly splits this Dataset with the provided weights.
randomSplitAsList(double[], long) - Method in class org.apache.spark.sql.Dataset: Returns a Java list containing the Datasets produced by randomly splitting this Dataset with the provided weights.
randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator.
RandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
range(long, long, long, int) - Method in class org.apache.spark.SparkContext: Creates a new RDD[Long] containing elements from start to end (exclusive), increased by step every element.
range(long) - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Creates a Dataset with a single LongType column named id, containing elements in a range from 0 to end (exclusive) with step value 1.
range(long, long) - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Creates a Dataset with a single LongType column named id, containing elements in a range from start to end (exclusive) with step value 1.
range(long, long, long) - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Creates a Dataset with a single LongType column named id, containing elements in a range from start to end (exclusive) with a step value.
range(long, long, long, int) - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Creates a Dataset with a single LongType column named id, containing elements in a range from start to end (exclusive) with a step value, with partition number specified.
range(long) - Method in class org.apache.spark.sql.SQLContext
range(long, long) - Method in class org.apache.spark.sql.SQLContext
range(long, long, long) - Method in class org.apache.spark.sql.SQLContext
range(long, long, long, int) - Method in class org.apache.spark.sql.SQLContext
rangeBetween(long, long) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the frame boundaries defined, from start (inclusive) to end (inclusive).
rangeBetween(Column, Column) - Static method in class org.apache.spark.sql.expressions.Window: Deprecated. Use the version with Long parameter types. Since 2.4.0.
rangeBetween(long, long) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the frame boundaries, from start (inclusive) to end (inclusive).
rangeBetween(Column, Column) - Method in class org.apache.spark.sql.expressions.WindowSpec: Deprecated. Use the version with Long parameter types. Since 2.4.0.
RangeDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
RangePartitioner<K,V> - Class in org.apache.spark: A Partitioner that partitions sortable records by range into roughly equal ranges.
RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, int, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
rank() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
rank() - Method in class org.apache.spark.ml.recommendation.ALSModel
rank() - Method in interface org.apache.spark.ml.recommendation.ALSParams: Param for rank of the matrix factorization (positive).
rank() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: The numeric rank of the fitted linear model.
rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
rank() - Static method in class org.apache.spark.sql.functions: Window function: returns the rank of rows within a window partition.
RankingMetrics<T> - Class in org.apache.spark.mllib.evaluation: Evaluator for ranking algorithms.
RankingMetrics(RDD<Tuple2<Object, Object>>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.evaluation.RankingMetrics
RateEstimator - Interface in org.apache.spark.streaming.scheduler.rate: A component that estimates the rate at which an InputDStream should ingest records, based on updates at every batch completion.
Rating(ID, ID, float) - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating
rating() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
Rating - Class in org.apache.spark.mllib.recommendation: A more compact class to represent a rating than Tuple3[Int, Int, Double].
Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
rating() - Method in class org.apache.spark.mllib.recommendation.Rating
Rating$() - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating$
RatingBlock$() - Constructor for class org.apache.spark.ml.recommendation.ALS.RatingBlock$
ratingCol() - Method in interface org.apache.spark.ml.recommendation.ALSParams: Param for the column name for ratings.
ratioParam() - Static method in class org.apache.spark.ml.image.SamplePathFilter
raw2ProbabilityInPlace(Vector) - Method in interface org.apache.spark.ml.ann.TopologyModel: Probability of the model.
rawPredictionCol() - Method in interface org.apache.spark.ml.param.shared.HasRawPredictionCol: Param for raw prediction (a.k.a.
rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
RawTextHelper - Class in org.apache.spark.streaming.util
RawTextHelper() - Constructor for class org.apache.spark.streaming.util.RawTextHelper
RawTextSender - Class in org.apache.spark.streaming.util: A helper program that sends blocks of Kryo-serialized text strings out on a socket at a specified rate.
RawTextSender() - Constructor for class org.apache.spark.streaming.util.RawTextSender
RBackendAuthHandler - Class in org.apache.spark.api.r: Authentication handler for connections from the R process.
RBackendAuthHandler(String) - Constructor for class org.apache.spark.api.r.RBackendAuthHandler
rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
rdd() - Method in class org.apache.spark.api.java.JavaRDD
rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
RDD() - Static method in class org.apache.spark.api.r.RRunnerModes
rdd() - Method in class org.apache.spark.Dependency
rdd() - Method in class org.apache.spark.NarrowDependency
RDD<T> - Class in org.apache.spark.rdd: A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD: Construct an RDD with just a one-to-one dependency on one parent
rdd() - Method in class org.apache.spark.ShuffleDependency
rdd() - Method in class org.apache.spark.sql.Dataset: Represents the content of the Dataset as an RDD of T.
RDD() - Static method in class org.apache.spark.storage.BlockId
RDDBarrier<T> - Class in org.apache.spark.rdd: :: Experimental :: Wraps an RDD in a barrier stage, which forces Spark to launch tasks of this stage together.
RDDBlockId - Class in org.apache.spark.storage
RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
rddBlocks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
rddBlocks() - Method in class org.apache.spark.status.LiveExecutor
rddCleaned(int) - Method in interface org.apache.spark.CleanerListener
RDDDataDistribution - Class in org.apache.spark.status.api.v1
RDDFunctions<T> - Class in org.apache.spark.mllib.rdd: :: DeveloperApi :: Machine learning specific RDD functions.
RDDFunctions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RDDFunctions
rddId() - Method in class org.apache.spark.CleanCheckpoint
rddId() - Method in class org.apache.spark.CleanRDD
rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
rddId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveRdd
rddId() - Method in class org.apache.spark.storage.RDDBlockId
rddIds() - Method in class org.apache.spark.status.api.v1.StageData
RDDInfo - Class in org.apache.spark.storage
RDDInfo(int, String, int, StorageLevel, Seq<Object>, String, Option<org.apache.spark.rdd.RDDOperationScope>) - Constructor for class org.apache.spark.storage.RDDInfo
rddInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
rddInfoToJson(RDDInfo) - Static method in class org.apache.spark.util.JsonProtocol
RDDPartitionInfo - Class in org.apache.spark.status.api.v1
RDDPartitionSeq - Class in org.apache.spark.status: A custom sequence of partitions based on a mutable linked list.
RDDPartitionSeq() - Constructor for class org.apache.spark.status.RDDPartitionSeq
rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
rdds() - Method in class org.apache.spark.rdd.UnionRDD
RDDStorageInfo - Class in org.apache.spark.status.api.v1
rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.rdd.RDD
rddToDatasetHolder(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLImplicits: Creates a Dataset from an RDD.
rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.rdd.RDD
rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.rdd.RDD
rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, <any>, <any>) - Static method in class org.apache.spark.rdd.RDD
read() - Method in class org.apache.spark.io.NioBufferedFileInputStream
read(byte[], int, int) - Method in class org.apache.spark.io.NioBufferedFileInputStream
read() - Method in class org.apache.spark.io.ReadAheadInputStream
read(byte[], int, int) - Method in class org.apache.spark.io.ReadAheadInputStream
read() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
read() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier
read() - Static method in class org.apache.spark.ml.classification.GBTClassificationModel
read() - Static method in class org.apache.spark.ml.classification.GBTClassifier
read() - Static method in class org.apache.spark.ml.classification.LinearSVC
read() - Static method in class org.apache.spark.ml.classification.LinearSVCModel
read() - Static method in class org.apache.spark.ml.classification.LogisticRegression
read() - Static method in class org.apache.spark.ml.classification.LogisticRegressionModel
read() - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
read() - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
read() - Static method in class org.apache.spark.ml.classification.NaiveBayes
read() - Static method in class org.apache.spark.ml.classification.NaiveBayesModel
read() - Static method in class org.apache.spark.ml.classification.OneVsRest
read() - Static method in class org.apache.spark.ml.classification.OneVsRestModel
read() - Static method in class org.apache.spark.ml.classification.RandomForestClassificationModel
read() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
read() - Static method in class org.apache.spark.ml.clustering.BisectingKMeans
read() - Static method in class org.apache.spark.ml.clustering.BisectingKMeansModel
read() - Static method in class org.apache.spark.ml.clustering.DistributedLDAModel
read() - Static method in class org.apache.spark.ml.clustering.GaussianMixture
read() - Static method in class org.apache.spark.ml.clustering.GaussianMixtureModel
read() - Static method in class org.apache.spark.ml.clustering.KMeans
read() - Static method in class org.apache.spark.ml.clustering.KMeansModel
read() - Static method in class org.apache.spark.ml.clustering.LDA
read() - Static method in class org.apache.spark.ml.clustering.LocalLDAModel
read() - Static method in class org.apache.spark.ml.clustering.PowerIterationClustering
read() - Static method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
read() - Static method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
read() - Static method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
read() - Static method in class org.apache.spark.ml.evaluation.RegressionEvaluator
read() - Static method in class org.apache.spark.ml.feature.Binarizer
read() - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
read() - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
read() - Static method in class org.apache.spark.ml.feature.Bucketizer
read() - Static method in class org.apache.spark.ml.feature.ChiSqSelector
read() - Static method in class org.apache.spark.ml.feature.ChiSqSelectorModel
read() - Static method in class org.apache.spark.ml.feature.ColumnPruner
read() - Static method in class org.apache.spark.ml.feature.CountVectorizer
read() - Static method in class org.apache.spark.ml.feature.CountVectorizerModel
read() - Static method in class org.apache.spark.ml.feature.DCT
read() - Static method in class org.apache.spark.ml.feature.ElementwiseProduct
read() - Static method in class org.apache.spark.ml.feature.FeatureHasher
read() - Static method in class org.apache.spark.ml.feature.HashingTF
read() - Static method in class org.apache.spark.ml.feature.IDF
read() - Static method in class org.apache.spark.ml.feature.IDFModel
read() - Static method in class org.apache.spark.ml.feature.Imputer
read() - Static method in class org.apache.spark.ml.feature.ImputerModel
read() - Static method in class org.apache.spark.ml.feature.IndexToString
read() - Static method in class org.apache.spark.ml.feature.Interaction
read() - Static method in class org.apache.spark.ml.feature.MaxAbsScaler
read() - Static method in class org.apache.spark.ml.feature.MaxAbsScalerModel
read() - Static method in class org.apache.spark.ml.feature.MinHashLSH
read() - Static method in class org.apache.spark.ml.feature.MinHashLSHModel
read() - Static method in class org.apache.spark.ml.feature.MinMaxScaler
read() - Static method in class org.apache.spark.ml.feature.MinMaxScalerModel
read() - Static method in class org.apache.spark.ml.feature.NGram
read() - Static method in class org.apache.spark.ml.feature.Normalizer
read() - Static method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
read() - Static method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
read() - Static method in class org.apache.spark.ml.feature.OneHotEncoderModel
read() - Static method in class org.apache.spark.ml.feature.PCA
read() - Static method in class org.apache.spark.ml.feature.PCAModel
read() - Static method in class org.apache.spark.ml.feature.PolynomialExpansion
read() - Static method in class org.apache.spark.ml.feature.QuantileDiscretizer
read() - Static method in class org.apache.spark.ml.feature.RegexTokenizer
read() - Static method in class org.apache.spark.ml.feature.RFormula
read() - Static method in class org.apache.spark.ml.feature.RFormulaModel
read() - Static method in class org.apache.spark.ml.feature.SQLTransformer
read() - Static method in class org.apache.spark.ml.feature.StandardScaler
read() - Static method in class org.apache.spark.ml.feature.StandardScalerModel
read() - Static method in class org.apache.spark.ml.feature.StopWordsRemover
read() - Static method in class org.apache.spark.ml.feature.StringIndexer
read() - Static method in class org.apache.spark.ml.feature.StringIndexerModel
read() - Static method in class org.apache.spark.ml.feature.Tokenizer
read() - Static method in class org.apache.spark.ml.feature.VectorAssembler
read() - Static method in class org.apache.spark.ml.feature.VectorAttributeRewriter
read() - Static method in class org.apache.spark.ml.feature.VectorIndexer
read() - Static method in class org.apache.spark.ml.feature.VectorIndexerModel
read() - Static method in class org.apache.spark.ml.feature.VectorSizeHint
read() - Static method in class org.apache.spark.ml.feature.VectorSlicer
read() - Static method in class org.apache.spark.ml.feature.Word2Vec
read() - Static method in class org.apache.spark.ml.feature.Word2VecModel
read() - Static method in class org.apache.spark.ml.fpm.FPGrowth
read() - Static method in class org.apache.spark.ml.fpm.FPGrowthModel
read() - Static method in class org.apache.spark.ml.Pipeline
read() - Static method in class org.apache.spark.ml.PipelineModel
read() - Static method in class org.apache.spark.ml.recommendation.ALS
read() - Static method in class org.apache.spark.ml.recommendation.ALSModel
read() - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegression
read() - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
read() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
read() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor
read() - Static method in class org.apache.spark.ml.regression.GBTRegressionModel
read() - Static method in class org.apache.spark.ml.regression.GBTRegressor
read() - Static method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
read() - Static method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
read() - Static method in class org.apache.spark.ml.regression.IsotonicRegression
read() - Static method in class org.apache.spark.ml.regression.IsotonicRegressionModel
read() - Static method in class org.apache.spark.ml.regression.LinearRegression
read() - Static method in class org.apache.spark.ml.regression.LinearRegressionModel
read() - Static method in class org.apache.spark.ml.regression.RandomForestRegressionModel
read() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
read() - Static method in class org.apache.spark.ml.tuning.CrossValidator
read() - Static method in class org.apache.spark.ml.tuning.CrossValidatorModel
read() - Static method in class org.apache.spark.ml.tuning.TrainValidationSplit
read() - Static method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
read() - Method in interface org.apache.spark.ml.util.DefaultParamsReadable
read() - Method in interface org.apache.spark.ml.util.MLReadable: Returns an MLReader instance for this class.
read(ByteBuffer) - Method in class org.apache.spark.security.CryptoStreamUtils.ErrorHandlingReadableChannel
read(Kryo, Input, Class<Iterable<?>>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
read() - Method in class org.apache.spark.sql.SparkSession: Returns a DataFrameReader that can be used to read non-streaming data in as a DataFrame.
read() - Method in class org.apache.spark.sql.SQLContext
read() - Method in class org.apache.spark.storage.BufferReleasingInputStream
read(byte[]) - Method in class org.apache.spark.storage.BufferReleasingInputStream
read(byte[], int, int) - Method in class org.apache.spark.storage.BufferReleasingInputStream
read(String) - Static method in class org.apache.spark.streaming.CheckpointReader: Read checkpoint files present in the given checkpoint directory.
read(String, SparkConf, Configuration, boolean) - Static method in class org.apache.spark.streaming.CheckpointReader: Read checkpoint files present in the given checkpoint directory.
read(WriteAheadLogRecordHandle) - Method in class org.apache.spark.streaming.util.WriteAheadLog: Read a written record based on the given record handle.
ReadableChannelFileRegion - Class in org.apache.spark.storage
ReadableChannelFileRegion(ReadableByteChannel, long) - Constructor for class org.apache.spark.storage.ReadableChannelFileRegion
ReadAheadInputStream - Class in org.apache.spark.io: InputStream implementation which asynchronously reads ahead from the underlying input stream when a specified amount of data has been read from the current buffer.
ReadAheadInputStream(InputStream, int) - Constructor for class org.apache.spark.io.ReadAheadInputStream: Creates a ReadAheadInputStream with the specified buffer size and read-ahead threshold
readAll() - Method in class org.apache.spark.streaming.util.WriteAheadLog: Read and return an iterator of all the records that have been written but not yet cleaned up.
readArray(DataInputStream, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
readBoolean(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readBooleanArr(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readBytes(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readBytes() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
readBytesArr(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readDate(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readDouble(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readDoubleArr(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
readFrom(ConfigReader) - Method in class org.apache.spark.internal.config.ConfigEntryWithDefault
readFrom(ConfigReader) - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultFunction
readFrom(ConfigReader) - Method in class org.apache.spark.internal.config.ConfigEntryWithDefaultString
readFrom(InputStream) - Static method in class org.apache.spark.util.sketch.BloomFilter: Reads in a BloomFilter from an input stream.
readFrom(InputStream) - Static method in class org.apache.spark.util.sketch.CountMinSketch: Reads in a CountMinSketch from an input stream.
readFrom(byte[]) - Static method in class org.apache.spark.util.sketch.CountMinSketch: Reads in a CountMinSketch from a byte array.
readImages(String) - Static method in class org.apache.spark.ml.image.ImageSchema: Deprecated. Use `spark.read.format("image").load(path)` instead; `readImages` will be removed in 3.0.0. Since 2.4.0.
readImages(String, SparkSession, boolean, int, boolean, double, long) - Static method in class org.apache.spark.ml.image.ImageSchema: Deprecated. Use `spark.read.format("image").load(path)` instead; `readImages` will be removed in 3.0.0. Since 2.4.0.
readInt(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readIntArr(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readKey(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream: Reads the object representing the key of a key-value pair.
readList(DataInputStream, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
readMap(DataInputStream, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
readObject(DataInputStream, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream: The most general-purpose method to read an object.
readObjectType(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readRecords() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
readSchema(Seq<String>, Option<Configuration>, boolean) - Static method in class org.apache.spark.sql.hive.orc.OrcFileOperator
readSchema() - Method in interface org.apache.spark.sql.sources.v2.reader.DataSourceReader: Returns the actual schema of this data source reader, which may be different from the physical schema of the underlying storage, as column pruning or other optimizations may happen.
readSqlObject(DataInputStream, char) - Static method in class org.apache.spark.sql.api.r.SQLUtils
readStream() - Method in class org.apache.spark.sql.SparkSession: Returns a DataStreamReader that can be used to read streaming data in as a DataFrame.
readStream() - Method in class org.apache.spark.sql.SQLContext
readString(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readStringArr(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readStringBytes(DataInputStream, int) - Static method in class org.apache.spark.api.r.SerDe
ReadSupport - Interface in org.apache.spark.sql.sources.v2: A mix-in interface for DataSourceV2.
readTime(DataInputStream) - Static method in class org.apache.spark.api.r.SerDe
readTypedObject(DataInputStream, char, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
readValue(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream: Reads the object representing the value of a key-value pair.
ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction: Blocks until this action completes.
ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
reason() - Method in class org.apache.spark.ExecutorLostFailure
reason() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
reason() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
reason() - Method in class org.apache.spark.scheduler.local.KillTask
reason() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
reason() - Method in class org.apache.spark.TaskKilled
reason() - Method in exception org.apache.spark.TaskKilledException
Recall - Class in org.apache.spark.mllib.evaluation.binary: Recall.
Recall() - Constructor for class org.apache.spark.mllib.evaluation.binary.Recall
recall(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns recall for a given label (category)
recall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Deprecated. Use accuracy. Since 2.0.0.
recall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns document-based recall averaged by the number of documents
recall(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns recall for a given label (category)
recallByLabel() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns recall for each label (category).
recallByThreshold() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Returns a DataFrame with two fields (threshold, recall) that together form the recall curve.
recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, recall) curve.
receive() - Method in interface org.apache.spark.rpc.RpcEndpoint: Process messages from RpcEndpointRef.send or RpcCallContext.reply.
receiveAndReply(RpcCallContext) - Method in interface org.apache.spark.rpc.RpcEndpoint: Process messages from RpcEndpointRef.ask.
ReceivedBlock - Interface in org.apache.spark.streaming.receiver: Trait representing a received block
ReceivedBlockHandler - Interface in org.apache.spark.streaming.receiver: Trait that represents a class that handles the storage of blocks received by a receiver
ReceivedBlockStoreResult - Interface in org.apache.spark.streaming.receiver: Trait that represents the metadata related to storage of blocks
ReceivedBlockTrackerLogEvent - Interface in org.apache.spark.streaming.scheduler: Trait representing any event in the ReceivedBlockTracker that updates its state.
Receiver<T> - Class in org.apache.spark.streaming.receiver: :: DeveloperApi :: Abstract class of a receiver that can be run on worker nodes to receive external data.
Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
RECEIVER_WAL_CLASS_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
RECEIVER_WAL_CLOSE_AFTER_WRITE_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
RECEIVER_WAL_ENABLE_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
RECEIVER_WAL_MAX_FAILURES_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
RECEIVER_WAL_ROLLING_INTERVAL_CONF_KEY() - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
ReceiverInfo - Class in org.apache.spark.status.api.v1.streaming
ReceiverInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Class having information about a receiver
ReceiverInfo(int, String, boolean, String, String, String, String, long) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream: Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data.
ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
ReceiverMessage - Interface in org.apache.spark.streaming.receiver: Messages sent to the Receiver.
ReceiverState - Class in org.apache.spark.streaming.scheduler: Enumeration to identify the current state of a Receiver
ReceiverState() - Constructor for class org.apache.spark.streaming.scheduler.ReceiverState
receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented receiver.
receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream with an arbitrary user-implemented receiver.
ReceiverTrackerLocalMessage - Interface in org.apache.spark.streaming.scheduler: Messages used by the driver and ReceiverTrackerEndpoint to communicate locally.
ReceiverTrackerMessage - Interface in org.apache.spark.streaming.scheduler: Messages used by the NetworkReceiver and the ReceiverTracker to communicate with each other.
recentProgress() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Returns an array of the most recent StreamingQueryProgress updates for this query.
recommendForAllItems(int) - Method in class org.apache.spark.ml.recommendation.ALSModel: Returns top numUsers users recommended for each item, for all items.
recommendForAllUsers(int) - Method in class org.apache.spark.ml.recommendation.ALSModel: Returns top numItems items recommended for each user, for all users.
recommendForItemSubset(Dataset<?>, int) - Method in class org.apache.spark.ml.recommendation.ALSModel: Returns top numUsers users recommended for each item id in the input data set.
recommendForUserSubset(Dataset<?>, int) - Method in class org.apache.spark.ml.recommendation.ALSModel: Returns top numItems items recommended for each user id in the input data set.
recommendProducts(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Recommends products to a user.
recommendProductsForUsers(int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Recommends top products for all users.
recommendUsers(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Recommends users to a product.
recommendUsersForProducts(int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Recommends top users for all products.
recordReader(InputStream, Configuration) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
recordReaderClass() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
RECORDS_BETWEEN_BYTES_READ_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.HadoopRDD: Update the input bytes read metric each time this number of records has been read.
RECORDS_READ() - Method in class org.apache.spark.InternalAccumulator.input$
RECORDS_READ() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
RECORDS_WRITTEN() - Method in class org.apache.spark.InternalAccumulator.output$
RECORDS_WRITTEN() - Method in class org.apache.spark.InternalAccumulator.shuffleWrite$
recordsRead() - Method in class org.apache.spark.status.api.v1.InputMetricDistributions
recordsRead() - Method in class org.apache.spark.status.api.v1.InputMetrics
recordsRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
recordsWritten() - Method in class org.apache.spark.status.api.v1.OutputMetricDistributions
recordsWritten() - Method in class org.apache.spark.status.api.v1.OutputMetrics
recordsWritten() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetrics
recordWriter(OutputStream, Configuration) - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
recordWriterClass() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
recoverPartitions(String) - Method in class org.apache.spark.sql.catalog.Catalog: Recovers all the partitions in the directory of a table and updates the catalog.
RecursiveFlag - Class in org.apache.spark.ml.image
RecursiveFlag() - Constructor for class org.apache.spark.ml.image.RecursiveFlag
recursiveList(File) - Static method in class org.apache.spark.TestUtils: Lists files recursively.
redact(SparkConf, Seq<Tuple2<String, String>>) - Static method in class org.apache.spark.util.Utils: Redact the sensitive values in the given map.
redact(Option<Regex>, Seq<Tuple2<String, String>>) - Static method in class org.apache.spark.util.Utils: Redact the sensitive values in the given map.
redact(Option<Regex>, String) - Static method in class org.apache.spark.util.Utils: Redact the sensitive information in the given string.
redact(Map<String, String>) - Static method in class org.apache.spark.util.Utils: Looks up the redaction regex from within the key value pairs and uses it to redact the rest of the key value pairs.
REDIRECT_CONNECTOR_NAME() - Static method in class org.apache.spark.ui.JettyUtils
redirectableStream() - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
redirectError() - Method in class org.apache.spark.launcher.SparkLauncher: Specifies that stderr in spark-submit should be redirected to stdout.
redirectError(ProcessBuilder.Redirect) - Method in class org.apache.spark.launcher.SparkLauncher: Redirects error output to the specified Redirect.
redirectError(File) - Method in class org.apache.spark.launcher.SparkLauncher: Redirects error output to the specified File.
redirectOutput(ProcessBuilder.Redirect) - Method in class org.apache.spark.launcher.SparkLauncher: Redirects standard output to the specified Redirect.
redirectOutput(File) - Method in class org.apache.spark.launcher.SparkLauncher: Redirects standard output to the specified File.
redirectToLog(String) - Method in class org.apache.spark.launcher.SparkLauncher: Sets all output to be logged and redirected to a logger with the specified name.
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD: Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Scala-specific) Reduces the elements of this Dataset using the specified binary function.
reduce(ReduceFunction<T>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: (Java-specific) Reduces the elements of this Dataset using the specified binary function.
reduce(BUF, IN) - Method in class org.apache.spark.sql.expressions.Aggregator: Combine two values to produce a new value.
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative and commutative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative and commutative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative and commutative reduce function.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative and commutative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative and commutative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative and commutative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Create a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by reducing over a sliding window using incremental computation.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative and commutative reduce function, but return the result immediately to the master as a Map.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative and commutative reduce function, but return the results immediately to the master as a Map.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
ReduceFunction<T> - Interface in org.apache.spark.api.java.function: Base interface for function used in Dataset's reduce.
reduceGroups(Function2<V, V, V>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: (Scala-specific) Reduces the elements of each group of data using the specified binary function.
reduceGroups(ReduceFunction<V>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset: (Java-specific) Reduces the elements of each group of data using the specified binary function.
reduceId() - Method in class org.apache.spark.FetchFailed
reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
reduceId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
reduceId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
references() - Method in class org.apache.spark.sql.sources.And
references() - Method in class org.apache.spark.sql.sources.EqualNullSafe
references() - Method in class org.apache.spark.sql.sources.EqualTo
references() - Method in class org.apache.spark.sql.sources.Filter: List of columns that are referenced by this filter.
references() - Method in class org.apache.spark.sql.sources.GreaterThan
references() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
references() - Method in class org.apache.spark.sql.sources.In
references() - Method in class org.apache.spark.sql.sources.IsNotNull
references() - Method in class org.apache.spark.sql.sources.IsNull
references() - Method in class org.apache.spark.sql.sources.LessThan
references() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
references() - Method in class org.apache.spark.sql.sources.Not
references() - Method in class org.apache.spark.sql.sources.Or
references() - Method in class org.apache.spark.sql.sources.StringContains
references() - Method in class org.apache.spark.sql.sources.StringEndsWith
references() - Method in class org.apache.spark.sql.sources.StringStartsWith
refreshByPath(String) - Method in class org.apache.spark.sql.catalog.Catalog: Invalidates and refreshes all the cached data (and the associated metadata) for any Dataset that contains the given data source path.
refreshTable(String) - Method in class org.apache.spark.sql.catalog.Catalog: Invalidates and refreshes all the cached data and metadata of the given table.
refreshTable(String) - Method in class org.apache.spark.sql.hive.HiveContext: Deprecated. Invalidate and refresh all the cached metadata of the given table.
regex(Regex) - Static method in class org.apache.spark.ml.feature.RFormulaParser
regexFromString(String, String) - Static method in class org.apache.spark.internal.config.ConfigHelpers
regexp_extract(Column, String, int) - Static method in class org.apache.spark.sql.functions: Extract a specific group matched by a Java regex, from the specified string column.
regexp_replace(Column, String, String) - Static method in class org.apache.spark.sql.functions: Replace all substrings of the specified string value that match regexp with rep.
regexp_replace(Column, Column, Column) - Static method in class org.apache.spark.sql.functions: Replace all substrings of the specified string value that match regexp with rep.
RegexTokenizer - Class in org.apache.spark.ml.feature: A regex based tokenizer that extracts tokens either by using the provided regex pattern to split the text (default) or repeatedly matching the regex (if gaps is false).
RegexTokenizer(String) - Constructor for class org.apache.spark.ml.feature.RegexTokenizer
RegexTokenizer() - Constructor for class org.apache.spark.ml.feature.RegexTokenizer
register(AccumulatorV2<?, ?>) - Method in class org.apache.spark.SparkContext: Register the given accumulator.
register(AccumulatorV2<?, ?>, String) - Method in class org.apache.spark.SparkContext: Register the given accumulator with given name.
register(String, String) - Static method in class org.apache.spark.sql.types.UDTRegistration: Registers a UserDefinedType to a user class.
register(String, UserDefinedAggregateFunction) - Method in class org.apache.spark.sql.UDFRegistration: Registers a user-defined aggregate function (UDAF).
register(String, UserDefinedFunction) - Method in class org.apache.spark.sql.UDFRegistration: Registers a user-defined function (UDF), for a UDF that's already defined using the Dataset API.
register(String, Function0<RT>, TypeTags.TypeTag<RT>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 0 arguments as user-defined function (UDF).
register(String, Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 1 argument as user-defined function (UDF).
register(String, Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 2 arguments as user-defined function (UDF).
register(String, Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 3 arguments as user-defined function (UDF).
register(String, Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 4 arguments as user-defined function (UDF).
register(String, Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 5 arguments as user-defined function (UDF).
register(String, Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 6 arguments as user-defined function (UDF).
register(String, Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 7 arguments as user-defined function (UDF).
register(String, Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 8 arguments as user-defined function (UDF).
register(String, Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 9 arguments as user-defined function (UDF).
register(String, Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 10 arguments as user-defined function (UDF).
register(String, Function11<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 11 arguments as user-defined function (UDF).
register(String, Function12<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 12 arguments as user-defined function (UDF).
register(String, Function13<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 13 arguments as user-defined function (UDF).
register(String, Function14<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 14 arguments as user-defined function (UDF).
register(String, Function15<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 15 arguments as user-defined function (UDF).
register(String, Function16<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 16 arguments as user-defined function (UDF).
register(String, Function17<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 17 arguments as user-defined function (UDF).
register(String, Function18<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 18 arguments as user-defined function (UDF).
register(String, Function19<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 19 arguments as user-defined function (UDF).
register(String, Function20<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 20 arguments as user-defined function (UDF).
register(String, Function21<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 21 arguments as user-defined function (UDF).
register(String, Function22<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>, TypeTags.TypeTag<A22>) - Method in class org.apache.spark.sql.UDFRegistration: Registers a deterministic Scala closure of 22 arguments as user-defined function (UDF).
register(String, UDF0<?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF0 instance as user-defined function (UDF).
register(String, UDF1<?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF1 instance as user-defined function (UDF).
register(String, UDF2<?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF2 instance as user-defined function (UDF).
register(String, UDF3<?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF3 instance as user-defined function (UDF).
register(String, UDF4<?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF4 instance as user-defined function (UDF).
register(String, UDF5<?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF5 instance as user-defined function (UDF).
register(String, UDF6<?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF6 instance as user-defined function (UDF).
register(String, UDF7<?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF7 instance as user-defined function (UDF).
register(String, UDF8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF8 instance as user-defined function (UDF).
register(String, UDF9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF9 instance as user-defined function (UDF).
register(String, UDF10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF10 instance as user-defined function (UDF).
register(String, UDF11<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF11 instance as a user-defined function (UDF).
register(String, UDF12<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF12 instance as a user-defined function (UDF).
register(String, UDF13<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF13 instance as a user-defined function (UDF).
register(String, UDF14<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF14 instance as a user-defined function (UDF).
register(String, UDF15<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF15 instance as a user-defined function (UDF).
register(String, UDF16<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF16 instance as a user-defined function (UDF).
register(String, UDF17<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF17 instance as a user-defined function (UDF).
register(String, UDF18<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF18 instance as a user-defined function (UDF).
register(String, UDF19<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF19 instance as a user-defined function (UDF).
register(String, UDF20<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF20 instance as a user-defined function (UDF).
register(String, UDF21<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF21 instance as a user-defined function (UDF).
register(String, UDF22<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a deterministic Java UDF22 instance as a user-defined function (UDF).
register(QueryExecutionListener) - Method in class org.apache.spark.sql.util.ExecutionListenerManager: Registers the specified QueryExecutionListener.
register(AccumulatorV2<?, ?>) - Static method in class org.apache.spark.util.AccumulatorContext: Registers an AccumulatorV2 created on the driver such that it can be used on the executors.
register(String, Function0<Object>) - Static method in class org.apache.spark.util.SignalUtils: Adds an action to be run when a given signal is received by this process.
registerAvroSchemas(Seq<Schema>) - Method in class org.apache.spark.SparkConf: Use Kryo serialization and register the given set of Avro schemas so that the generic record serializer can decrease network IO.
RegisterBlockManager(BlockManagerId, long, long, org.apache.spark.rpc.RpcEndpointRef) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
RegisterBlockManager$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager$
registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
RegisterClusterManager(org.apache.spark.rpc.RpcEndpointRef) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager
RegisterClusterManager$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager$
registerDialect(JdbcDialect) - Static method in class org.apache.spark.sql.jdbc.JdbcDialects: Register a dialect for use on all new matching JDBC org.apache.spark.sql.DataFrame instances.
RegisteredExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisteredExecutor$
RegisterExecutor(String, org.apache.spark.rpc.RpcEndpointRef, String, int, Map<String, String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
RegisterExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor$
RegisterExecutorFailed(String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed
RegisterExecutorFailed$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutorFailed$
registerKryoClasses(SparkConf) - Static method in class org.apache.spark.graphx.GraphXUtils: Registers classes that GraphX uses with Kryo.
registerKryoClasses(SparkContext) - Static method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette: This method registers the class SquaredEuclideanSilhouette.ClusterStats for Kryo serialization.
registerKryoClasses(Class<?>[]) - Method in class org.apache.spark.SparkConf: Use Kryo serialization and register the given set of classes with Kryo.
registerLogger(Logger) - Static method in class org.apache.spark.util.SignalUtils: Register a signal handler to log signals on UNIX-like systems.
registerShutdownDeleteDir(File) - Static method in class org.apache.spark.util.ShutdownHookManager
registerStream(DStream<BinarySample>) - Method in class org.apache.spark.mllib.stat.test.StreamingTest: Register a DStream of values for significance testing.
registerStream(JavaDStream<BinarySample>) - Method in class org.apache.spark.mllib.stat.test.StreamingTest: Register a JavaDStream of values for significance testing.
registerTempTable(String) - Method in class org.apache.spark.sql.Dataset: Deprecated. Use createOrReplaceTempView(viewName) instead. Since 2.0.0.
regParam() - Method in interface org.apache.spark.ml.optim.loss.DifferentiableRegularization: Magnitude of the regularization penalty.
regParam() - Method in interface org.apache.spark.ml.param.shared.HasRegParam: Param for regularization parameter (>= 0).
Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
RegressionEvaluator - Class in org.apache.spark.ml.evaluation: :: Experimental :: Evaluator for regression, which expects two input columns: prediction and label.
RegressionEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.RegressionEvaluator
RegressionEvaluator() - Constructor for class org.apache.spark.ml.evaluation.RegressionEvaluator
RegressionMetrics - Class in org.apache.spark.mllib.evaluation: Evaluator for regression.
RegressionMetrics(RDD<Tuple2<Object, Object>>, boolean) - Constructor for class org.apache.spark.mllib.evaluation.RegressionMetrics
RegressionMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.RegressionMetrics
RegressionModel<FeaturesType,M extends RegressionModel<FeaturesType,M>> - Class in org.apache.spark.ml.regression: :: DeveloperApi ::
RegressionModel() - Constructor for class org.apache.spark.ml.regression.RegressionModel
RegressionModel - Interface in org.apache.spark.mllib.regression
reindex() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
reindex() - Method in class org.apache.spark.graphx.VertexRDD: Construct a new VertexRDD that is indexed by only the visible vertices.
RelationalGroupedDataset - Class in org.apache.spark.sql: A set of methods for aggregations on a DataFrame, created by groupBy, cube or rollup (and also pivot).
RelationalGroupedDataset.CubeType$ - Class in org.apache.spark.sql: To indicate it's the CUBE
RelationalGroupedDataset.GroupByType$ - Class in org.apache.spark.sql: To indicate it's the GroupBy
RelationalGroupedDataset.GroupType - Interface in org.apache.spark.sql: The Grouping Type
RelationalGroupedDataset.PivotType$ - Class in org.apache.spark.sql
RelationalGroupedDataset.RollupType$ - Class in org.apache.spark.sql: To indicate it's the ROLLUP
RelationConversions - Class in org.apache.spark.sql.hive: Relation conversion from metastore relations to data source relations for better performance
RelationConversions(SQLConf, HiveSessionCatalog) - Constructor for class org.apache.spark.sql.hive.RelationConversions
RelationProvider - Interface in org.apache.spark.sql.sources: Implemented by objects that produce relations for a specific kind of data source.
relativeDirection(long) - Method in class org.apache.spark.graphx.Edge: Return the relative direction of the edge to the corresponding vertex.
relativeError() - Method in interface org.apache.spark.ml.feature.QuantileDiscretizerBase: Relative error (see documentation for org.apache.spark.sql.DataFrameStatFunctions.approxQuantile for description). Must be in the range [0, 1].
relativeError() - Method in class org.apache.spark.util.sketch.CountMinSketch: Returns the relative error (or eps) of this CountMinSketch.
rem(Decimal, Decimal) - Method in class org.apache.spark.sql.types.Decimal.DecimalAsIfIntegral$
remainder(Decimal) - Method in class org.apache.spark.sql.types.Decimal
remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Sets each DStream in this context to remember the RDDs it generated in the last given duration.
remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext: Set each DStream in this context to remember RDDs it generated in the last given duration.
REMOTE_BLOCKS_FETCHED() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
REMOTE_BYTES_READ() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
REMOTE_BYTES_READ_TO_DISK() - Method in class org.apache.spark.InternalAccumulator.shuffleRead$
remoteBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
remoteBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
remoteBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
remoteBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
remoteBytesReadToDisk() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
remoteBytesReadToDisk() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
remove(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap: Removes a key from this map and returns its previously associated value as an option.
remove(String) - Method in class org.apache.spark.SparkConf: Remove a parameter from the configuration
remove() - Method in interface org.apache.spark.sql.streaming.GroupState: Remove this state.
remove(String) - Method in class org.apache.spark.sql.types.MetadataBuilder
remove() - Method in class org.apache.spark.streaming.State: Remove the state if it exists.
remove(long) - Static method in class org.apache.spark.util.AccumulatorContext: Unregisters the AccumulatorV2 with the given ID, if any.
RemoveBlock(BlockId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBlock
RemoveBlock$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBlock$
RemoveBroadcast(long, boolean) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
RemoveBroadcast$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast$
removeDistribution(LiveExecutor) - Method in class org.apache.spark.status.LiveRDD
RemoveExecutor(String, org.apache.spark.scheduler.ExecutorLossReason) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor
RemoveExecutor(String) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor
RemoveExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveExecutor$
RemoveExecutor$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveExecutor$
removeFromDriver() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
removeListener(StreamingQueryListener) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager: Deregister a StreamingQueryListener.
removeListener(L) - Method in interface org.apache.spark.util.ListenerBus: Remove a listener so that it no longer receives any events.
removeListenerOnError(SparkListenerInterface) - Method in class org.apache.spark.scheduler.AsyncEventQueue
removeListenerOnError(L) - Method in interface org.apache.spark.util.ListenerBus: This can be overridden by subclasses if there is any extra cleanup to do when removing a listener.
removeMapOutput(int, BlockManagerId) - Method in class org.apache.spark.ShuffleStatus: Remove the map output which was served by the specified block manager.
removeOutputsByFilter(Function1<BlockManagerId, Object>) - Method in class org.apache.spark.ShuffleStatus: Removes all shuffle outputs that satisfy the filter.
removeOutputsOnExecutor(String) - Method in class org.apache.spark.ShuffleStatus: Removes all map outputs associated with the specified executor.
removeOutputsOnHost(String) - Method in class org.apache.spark.ShuffleStatus: Removes all shuffle outputs associated with this host.
removePartition(String) - Method in class org.apache.spark.status.LiveRDD
removePartition(LiveRDDPartition) - Method in class org.apache.spark.status.RDDPartitionSeq
RemoveRdd(int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveRdd
RemoveRdd$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveRdd$
removeReason() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
removeReason() - Method in class org.apache.spark.status.LiveExecutor
removeSchedulable(Schedulable) - Method in interface org.apache.spark.scheduler.Schedulable
removeSelfEdges() - Method in class org.apache.spark.graphx.GraphOps: Remove self edges.
RemoveShuffle(int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle
RemoveShuffle$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle$
removeShutdownDeleteDir(File) - Static method in class org.apache.spark.util.ShutdownHookManager
removeShutdownHook(Object) - Static method in class org.apache.spark.util.ShutdownHookManager: Remove a previously installed shutdown hook.
removeSparkListener(SparkListenerInterface) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Deregister the listener from Spark's listener bus.
removeStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
removeTime() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
removeTime() - Method in class org.apache.spark.status.LiveExecutor
RemoveWorker(String, String, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveWorker
RemoveWorker$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveWorker$
renameFunction(String, String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Rename an existing function in the database.
renamePartitions(String, String, Seq<Map<String, String>>, Seq<Map<String, String>>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Rename one or many existing table partitions, assuming they exist.
rep(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
rep1(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
rep1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
rep1sep(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<Object>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int, Column...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.
repartition(Column...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset partitioned by the given partitioning expressions, using spark.sql.shuffle.partitions as number of partitions.
repartition(int) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset that has exactly numPartitions partitions.
repartition(int, Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.
repartition(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset partitioned by the given partitioning expressions, using spark.sql.shuffle.partitions as number of partitions.
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream with an increased or decreased level of parallelism.
repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
repartitionAndSortWithinPartitions(Partitioner, Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD: Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.rdd.OrderedRDDFunctions: Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
repartitionByRange(int, Column...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.
repartitionByRange(Column...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset partitioned by the given partitioning expressions, using spark.sql.shuffle.partitions as number of partitions.
repartitionByRange(int, Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset partitioned by the given partitioning expressions into numPartitions.
repartitionByRange(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset partitioned by the given partitioning expressions, using spark.sql.shuffle.partitions as number of partitions.
repeat(Column, int) - Static method in class org.apache.spark.sql.functions: Repeats a string column n times, and returns it as a new string column.
replace(String, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Replaces values matching keys in replacement map with the corresponding values.
replace(String[], Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Replaces values matching keys in replacement map with the corresponding values.
replace(String, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Replaces values matching keys in replacement map.
replace(Seq<String>, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Replaces values matching keys in replacement map.
replaceCharType(DataType) - Static method in class org.apache.spark.sql.types.HiveStringType
replicas() - Method in class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock
ReplicateBlock(BlockId, Seq<BlockManagerId>, int) - Constructor for class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock
ReplicateBlock$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock$
replicatedVertexView() - Method in class org.apache.spark.graphx.impl.GraphImpl
replication() - Method in class org.apache.spark.storage.StorageLevel
reply(Object) - Method in interface org.apache.spark.rpc.RpcCallContext: Reply with a message to the sender.
repN(int, Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
report() - Method in interface org.apache.spark.metrics.sink.Sink
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Report exceptions in receiving data.
repsep(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<Object>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
requestedTotal() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
requestExecutors(int) - Method in interface org.apache.spark.ExecutorAllocationClient: Request an additional number of executors from the cluster manager.
RequestExecutors(int, int, Map<String, Object>, Set<String>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors
requestExecutors(int) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Request an additional number of executors from the cluster manager.
RequestExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RequestExecutors$
requestTotalExecutors(int, int, Map<String, Object>) - Method in interface org.apache.spark.ExecutorAllocationClient: Update the cluster manager on our scheduling needs.
requestTotalExecutors(int, int, Map<String, Object>) - Method in class org.apache.spark.SparkContext: Update the cluster manager on our scheduling needs.
res() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
reservoirSampleAndCount(Iterator<T>, int, long, ClassTag<T>) - Static method in class org.apache.spark.util.random.SamplingUtils: Reservoir sampling implementation that also returns the input size.
reset() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics: Resets the values of all metrics to zero.
reset() - Method in interface org.apache.spark.sql.hive.client.HiveClient: Used for testing only.
reset() - Method in class org.apache.spark.storage.BufferReleasingInputStream
reset() - Method in class org.apache.spark.util.AccumulatorV2: Resets this accumulator so that it holds the zero value.
reset() - Method in class org.apache.spark.util.CollectionAccumulator
reset() - Method in class org.apache.spark.util.DoubleAccumulator
reset() - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
reset() - Method in class org.apache.spark.util.LongAccumulator
resetTerminated() - Method in class org.apache.spark.sql.streaming.StreamingQueryManager: Forget about past terminated queries so that awaitAnyTermination() can be used again to wait for new terminations.
residualDegreeOfFreedom() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: The residual degrees of freedom.
residualDegreeOfFreedomNull() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: The residual degrees of freedom for the null model.
residuals() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: Get the default residuals (deviance residuals) of the fitted model.
residuals(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary: Get the residuals of the fitted model by type.
residuals() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Residuals (label - predicted value)
ResolveHiveSerdeTable - Class in org.apache.spark.sql.hive: Determine the database, serde/format and schema of the Hive serde table, according to the storage properties.
ResolveHiveSerdeTable(SparkSession) - Constructor for class org.apache.spark.sql.hive.ResolveHiveSerdeTable
resolveURI(String) - Static method in class org.apache.spark.util.Utils: Return a well-formed URI for the file described by a user input string.
resolveURIs(String) - Static method in class org.apache.spark.util.Utils: Resolve a comma-separated list of paths.
responder() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
responseFromBackup(String) - Static method in class org.apache.spark.util.Utils: Return true if the response message is sent from a backup Master on standby.
restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
ResubmitFailedStages - Class in org.apache.spark.scheduler
ResubmitFailedStages() - Constructor for class org.apache.spark.scheduler.ResubmitFailedStages
Resubmitted - Class in org.apache.spark: :: DeveloperApi :: An org.apache.spark.scheduler.ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.
Resubmitted() - Constructor for class org.apache.spark.Resubmitted
result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction: Awaits and returns the result (of type T) of this action.
result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.InternalAccumulator
RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
RESULT_SERIALIZATION_TIME() - Static method in class org.apache.spark.ui.ToolTips
RESULT_SIZE() - Static method in class org.apache.spark.InternalAccumulator
RESULT_SIZE() - Static method in class org.apache.spark.status.TaskIndexNames
resultFetchStart() - Method in class org.apache.spark.status.api.v1.TaskData
resultSerializationTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
resultSerializationTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
resultSize() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
resultSize() - Method in class org.apache.spark.status.api.v1.TaskMetrics
RetrieveLastAllocatedExecutorId$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveLastAllocatedExecutorId$
RetrieveSparkAppConfig$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RetrieveSparkAppConfig$
retryWaitMs(SparkConf) - Static method in class org.apache.spark.util.RpcUtils: Returns the configured number of milliseconds to wait on each retry
ReturnStatementFinder - Class in org.apache.spark.util
ReturnStatementFinder(Option<String>) - Constructor for class org.apache.spark.util.ReturnStatementFinder
reverse() - Method in class org.apache.spark.graphx.EdgeDirection: Reverse the direction of an edge.
reverse() - Method in class org.apache.spark.graphx.EdgeRDD: Reverse all the edges in this RDD.
reverse() - Method in class org.apache.spark.graphx.Graph: Reverses all edges in the graph.
reverse() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
reverse() - Method in class org.apache.spark.graphx.impl.GraphImpl
reverse(Column) - Static method in class org.apache.spark.sql.functions: Returns a reversed string or an array with reverse order of elements.
reverseRoutingTables() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
reverseRoutingTables() - Method in class org.apache.spark.graphx.VertexRDD: Returns a new VertexRDD reflecting a reversal of all edge directions in the corresponding EdgeRDD.
ReviveOffers - Class in org.apache.spark.scheduler.local
ReviveOffers() - Constructor for class org.apache.spark.scheduler.local.ReviveOffers
reviveOffers() - Method in interface org.apache.spark.scheduler.SchedulerBackend
ReviveOffers$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.ReviveOffers$
RFormula - Class in org.apache.spark.ml.feature: :: Experimental :: Implements the transforms required for fitting a dataset against an R model formula.
RFormula(String) - Constructor for class org.apache.spark.ml.feature.RFormula
RFormula() - Constructor for class org.apache.spark.ml.feature.RFormula
RFormulaBase - Interface in org.apache.spark.ml.feature: Base trait for RFormula and RFormulaModel.
RFormulaModel - Class in org.apache.spark.ml.feature: :: Experimental :: Model fitted by RFormula.
RFormulaParser - Class in org.apache.spark.ml.feature: Limited implementation of R formula parsing.
RFormulaParser() - Constructor for class org.apache.spark.ml.feature.RFormulaParser
RidgeRegressionModel - Class in org.apache.spark.mllib.regression: Regression model trained using RidgeRegression.
RidgeRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionModel
RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train a regression model with L2-regularization using Stochastic Gradient Descent.
RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Deprecated. Use ml.regression.LinearRegression with elasticNetParam = 0.0. Note the default regParam is 0.01 for RidgeRegressionWithSGD, but is 0.0 for LinearRegression. Since 2.0.0.
right() - Method in class org.apache.spark.sql.sources.And
right() - Method in class org.apache.spark.sql.sources.Or
rightCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit: Get sorted categories which split to the right
rightChild() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
rightChild() - Method in class org.apache.spark.ml.tree.InternalNode
rightChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return the index of the right child of this node.
rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
rightNodeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
rint(Column) - Static method in class org.apache.spark.sql.functions: Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
rint(String) - Static method in class org.apache.spark.sql.functions: Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
rlike(String) - Method in class org.apache.spark.sql.Column: SQL RLIKE expression (LIKE with Regex).
RMATa() - Static method in class org.apache.spark.graphx.util.GraphGenerators
RMATb() - Static method in class org.apache.spark.graphx.util.GraphGenerators
RMATc() - Static method in class org.apache.spark.graphx.util.GraphGenerators
RMATd() - Static method in class org.apache.spark.graphx.util.GraphGenerators
rmatGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators: A random graph generator using the R-MAT model, proposed in "R-MAT: A Recursive Model for Graph Mining" by Chakrabarti et al.
rnd() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
roc() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Returns the receiver operating characteristic (ROC) curve, which is a DataFrame having two fields (FPR, TPR) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the receiver operating characteristic (ROC) curve, which is an RDD of (false positive rate, true positive rate) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
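The shape of the curve described by the two roc() entries can be sketched in plain Java: one (false positive rate, true positive rate) point per distinct score threshold, with (0.0, 0.0) prepended and (1.0, 1.0) appended. This is an illustration under stated assumptions, not Spark's implementation: the class RocSketch and its method are hypothetical, and the input is assumed to contain at least one positive and one negative label.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class: a small in-memory ROC computation.
final class RocSketch {
    // Returns {FPR, TPR} pairs, starting with (0,0) and ending with (1,1).
    // Assumes labels contains at least one true and one false value.
    static List<double[]> roc(double[] scores, boolean[] labels) {
        int n = scores.length;
        Integer[] order = new Integer[n];
        int positives = 0;
        for (int i = 0; i < n; i++) {
            order[i] = i;
            if (labels[i]) positives++;
        }
        int negatives = n - positives;
        // Visit examples from the highest score to the lowest.
        Arrays.sort(order, (a, b) -> Double.compare(scores[b], scores[a]));

        List<double[]> curve = new ArrayList<>();
        curve.add(new double[] {0.0, 0.0}); // prepended point
        int tp = 0;
        int fp = 0;
        for (int k = 0; k < n; k++) {
            if (labels[order[k]]) tp++; else fp++;
            // Emit one point per distinct threshold (after the last tied score).
            boolean lastOfThreshold =
                k == n - 1 || scores[order[k + 1]] != scores[order[k]];
            if (lastOfThreshold) {
                curve.add(new double[] {(double) fp / negatives, (double) tp / positives});
            }
        }
        curve.add(new double[] {1.0, 1.0}); // appended point
        return curve;
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.8, 0.3, 0.1};
        boolean[] labels = {true, false, true, false};
        for (double[] p : roc(scores, labels)) {
            System.out.println("FPR=" + p[0] + " TPR=" + p[1]);
        }
    }
}
```

Because the lowest threshold already yields (1.0, 1.0), the appended point duplicates it in this sketch; the duplicate is harmless when computing the area under the curve.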
rolledOver() - Method in interface org.apache.spark.util.logging.RollingPolicy: Notify that rollover has occurred
RollingPolicy - Interface in org.apache.spark.util.logging: Defines the policy based on which RollingFileAppender will generate rolling files.
rollup(Column...) - Method in class org.apache.spark.sql.Dataset: Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.
rollup(String, String...) - Method in class org.apache.spark.sql.Dataset: Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.
rollup(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.
rollup(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset: Create a multi-dimensional rollup for the current Dataset using the specified columns, so we can run aggregation on them.
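A rollup over columns (a, b) follows the standard SQL ROLLUP semantics: aggregates are computed for the grouping sets (a, b), (a), and the empty set (the grand total). The plain-Java sketch below only enumerates those grouping sets; the class RollupSketch and the column names are hypothetical, and no Spark API is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class: lists the grouping sets implied by a rollup.
final class RollupSketch {
    // For columns [c1, ..., cn] returns [c1..cn], [c1..c(n-1)], ..., [].
    static List<List<String>> groupingSets(List<String> columns) {
        List<List<String>> sets = new ArrayList<>();
        for (int n = columns.size(); n >= 0; n--) {
            sets.add(new ArrayList<>(columns.subList(0, n)));
        }
        return sets;
    }

    public static void main(String[] args) {
        // Illustrative column names only.
        System.out.println(groupingSets(Arrays.asList("country", "city")));
    }
}
```

Column order matters: rollup("country", "city") yields per-country subtotals, while rollup("city", "country") yields per-city subtotals.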
RollupType$() - Constructor for class org.apache.spark.sql.RelationalGroupedDataset.RollupType$
rootMeanSquaredError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Returns the root mean squared error, which is defined as the square root of the mean squared error.
rootMeanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics: Returns the root mean squared error, which is defined as the square root of the mean squared error.
rootNode() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
rootNode() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
rootNode() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel: Root of the decision tree
rootPool() - Method in interface org.apache.spark.scheduler.SchedulableBuilder
rootPool() - Method in interface org.apache.spark.scheduler.TaskScheduler
round(Column) - Static method in class org.apache.spark.sql.functions: Returns the value of the column e rounded to 0 decimal places with HALF_UP round mode.
round(Column, int) - Static method in class org.apache.spark.sql.functions: Rounds the value of e to scale decimal places with HALF_UP round mode if scale is greater than or equal to 0, or at the integral part when scale is less than 0.
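The HALF_UP rule and the effect of a negative scale can be reproduced with java.math.BigDecimal. This is a plain-Java sketch of the rounding rule only, not Spark's code path; the class HalfUpRoundSketch and its method name are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper class: HALF_UP rounding at a given scale.
final class HalfUpRoundSketch {
    // scale >= 0 keeps that many decimal places;
    // scale < 0 rounds within the integral part (e.g. -2 rounds to hundreds).
    static String roundPlain(String value, int scale) {
        return new BigDecimal(value)
            .setScale(scale, RoundingMode.HALF_UP)
            .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(roundPlain("2.5", 0));
        System.out.println(roundPlain("2.345", 2));
        System.out.println(roundPlain("1250", -2));
        System.out.println(roundPlain("-2.5", 0));
    }
}
```

HALF_UP moves ties away from zero, so 2.5 becomes 3 and -2.5 becomes -3.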
ROUND_CEILING() - Static method in class org.apache.spark.sql.types.Decimal
ROUND_FLOOR() - Static method in class org.apache.spark.sql.types.Decimal
ROUND_HALF_EVEN() - Static method in class org.apache.spark.sql.types.Decimal
ROUND_HALF_UP() - Static method in class org.apache.spark.sql.types.Decimal
ROW() - Static method in class org.apache.spark.api.r.SerializationFormats
Row - Interface in org.apache.spark.sql: Represents one row of output from a relational operator.
row(T) - Method in interface org.apache.spark.ui.PagedTable
row_number() - Static method in class org.apache.spark.sql.functions: Window function: returns a sequential number starting at 1 within a window partition.
RowFactory - Class in org.apache.spark.sql: A factory class used to construct Row objects.
RowFactory() - Constructor for class org.apache.spark.sql.RowFactory
rowIndices() - Method in class org.apache.spark.ml.linalg.SparseMatrix
rowIndices() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
rowIter() - Method in interface org.apache.spark.ml.linalg.Matrix: Returns an iterator of row vectors.
rowIter() - Method in interface org.apache.spark.mllib.linalg.Matrix: Returns an iterator of row vectors.
rowIterator() - Method in class org.apache.spark.sql.vectorized.ColumnarBatch: Returns an iterator over the rows in this batch.
RowMatrix - Class in org.apache.spark.mllib.linalg.distributed: Represents a row-oriented distributed Matrix with no meaningful row indices.
RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix: Alternative constructor leaving matrix dimensions to be determined automatically.
rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
rowsBetween(long, long) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the frame boundaries defined, from start (inclusive) to end (inclusive).
rowsBetween(long, long) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the frame boundaries, from start (inclusive) to end (inclusive).
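Both rowsBetween entries describe a frame whose two boundaries are inclusive row offsets relative to the current row. The plain-Java sketch below computes a sum over such a frame for every row of one already-ordered partition. The class RowFrameSketch is hypothetical, and the sketch assumes small finite offsets; it does not model unbounded frame boundaries.

```java
import java.util.Arrays;

// Hypothetical helper class: sums over a row-based frame with inclusive bounds.
final class RowFrameSketch {
    // For each row i, sums values[i + start .. i + end], both ends inclusive,
    // clipped to the partition. Assumes small finite offsets.
    static long[] frameSums(long[] values, long start, long end) {
        long[] out = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            long lo = Math.max(0, i + start);
            long hi = Math.min(values.length - 1, i + end);
            long sum = 0;
            for (long j = lo; j <= hi; j++) {
                sum += values[(int) j];
            }
            out[i] = sum;
        }
        return out;
    }

    public static void main(String[] args) {
        long[] values = {1, 2, 3, 4};
        // Frame of "previous row and current row".
        System.out.println(Arrays.toString(frameSums(values, -1, 0)));
        // Frame of "previous, current, and next row".
        System.out.println(Arrays.toString(frameSums(values, -1, 1)));
    }
}
```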
rowsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
rPackages() - Static method in class org.apache.spark.api.r.RUtils
rpad(Column, int, String) - Static method in class org.apache.spark.sql.functions: Right-pad the string column with pad to a length of len.
RpcCallContext - Interface in org.apache.spark.rpc: A callback that RpcEndpoint can use to send back a message or failure.
RpcEndpoint - Interface in org.apache.spark.rpc: An end point for the RPC that defines what functions to trigger given a message.
rpcEnv() - Method in interface org.apache.spark.rpc.RpcEndpoint: The RpcEnv that this RpcEndpoint is registered to.
RpcEnvFactory - Interface in org.apache.spark.rpc: A factory class to create the RpcEnv.
RpcEnvFileServer - Interface in org.apache.spark.rpc: A server used by the RpcEnv to serve files to other processes owned by the application.
RpcUtils - Class in org.apache.spark.util
RpcUtils() - Constructor for class org.apache.spark.util.RpcUtils
RRDD<T> - Class in org.apache.spark.api.r: An RDD that stores serialized R objects as Array[Byte].
RRDD(RDD<T>, byte[], String, String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.RRDD
RRunnerModes - Class in org.apache.spark.api.r
RRunnerModes() - Constructor for class org.apache.spark.api.r.RRunnerModes
rtrim(Column) - Static method in class org.apache.spark.sql.functions: Trim the spaces from the right end of the specified string value.
rtrim(Column, String) - Static method in class org.apache.spark.sql.functions: Trim the specified character string from the right end of the specified string column.
ruleName() - Static method in class org.apache.spark.sql.hive.HiveAnalysis
run(Graph<VD, ED>, int, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ConnectedComponents: Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ConnectedComponents: Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
run(Graph<VD, ED>, int, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.LabelPropagation: Run static Label Propagation for detecting communities in networks.
run(Graph<VD, ED>, int, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank: Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
run(Graph<VD, ED>, Seq<Object>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ShortestPaths: Computes shortest paths to the given set of landmark vertices.
run(Graph<VD, ED>, int, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.StronglyConnectedComponents: Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.
run(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus: Implement SVD++ based on "Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model".
run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.TriangleCount
run(RDD<LabeledPoint>, BoostingStrategy, long, String) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees: Method to train a gradient boosting model
run(RDD<LabeledPoint>, Strategy, int, String, long, Option<org.apache.spark.ml.util.Instrumentation>, boolean, Option<String>) - Static method in class org.apache.spark.ml.tree.impl.RandomForest: Train a random forest.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS: Run Logistic Regression with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS: Run Logistic Regression with the configured parameters on an input RDD of LabeledPoint entries starting from the initial weights provided.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Runs the bisecting k-means algorithm.
run(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Java-friendly version of run().
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Perform expectation maximization
run(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Java-friendly version of run()
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans: Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
run(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LDA: Learn an LDA model using the given dataset.
run(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LDA: Java-friendly version of run()
run(Graph<Object, Object>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering: Run the PIC algorithm on Graph.
run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering: Run the PIC algorithm.
run(JavaRDD<Tuple3<Long, Long, Double>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering: A Java-friendly version of PowerIterationClustering.run.
run(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.AssociationRules: Computes the association rules with confidence above minConfidence.
run(RDD<FPGrowth.FreqItemset<Item>>, Map<Item, Object>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.AssociationRules: Computes the association rules with confidence above minConfidence.
run(JavaRDD<FPGrowth.FreqItemset<Item>>) - Method in class org.apache.spark.mllib.fpm.AssociationRules: Java-friendly version of run.
run(RDD<Object>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.FPGrowth: Computes an FP-Growth model that contains frequent itemsets.
run(JavaRDD<Basket>) - Method in class org.apache.spark.mllib.fpm.FPGrowth: Java-friendly version of run.
run(RDD<Object[]>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Finds the complete set of frequent sequential patterns in the input sequences of itemsets.
run(JavaRDD<Sequence>) - Method in class org.apache.spark.mllib.fpm.PrefixSpan: A Java-friendly version of run() that reads sequences from a JavaRDD and returns frequent sequences in a PrefixSpanModel.
run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS: Run ALS with the configured parameters on an input RDD of Rating objects.
run(JavaRDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS: Java-friendly version of ALS.run.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries starting from the initial weights provided.
run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression: Run IsotonicRegression algorithm to obtain isotonic regression model.
run(JavaRDD<Tuple3<Double, Double, Double>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression: Run pool adjacent violators algorithm to obtain isotonic regression model.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model over an RDD
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Method to train a gradient boosting model
run(JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.run.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.RandomForest: Method to train a decision tree model over an RDD
run(SparkSession, SparkPlan) - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
run(SparkSession, SparkPlan) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
run(SparkSession, SparkPlan) - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable: Inserts all the rows in the table into Hive.
run() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread
run() - Method in class org.apache.spark.util.SparkShutdownHook
runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ApproximateEvaluator<U, R>, long) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Run a job that can return approximate results.
runId() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Returns the unique id of this run of the query.
runId() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryStartedEvent
runId() - Method in class org.apache.spark.sql.streaming.StreamingQueryListener.QueryTerminatedEvent
runId() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
runInNewThread(String, boolean, Function0<T>) - Static method in class org.apache.spark.util.ThreadUtils: Run a piece of code in a new thread and return the result.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and pass the results to the given handler function.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and return the results as an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and return the results as an array.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and pass the results to a handler function.
runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and pass the results to a handler function.
runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS: Run Limited-memory BFGS (L-BFGS) in parallel.
runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector, double) - Static method in class org.apache.spark.mllib.optimization.GradientDescent: Run stochastic gradient descent (SGD) in parallel using mini batches.
runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent: Alias of runMiniBatchSGD with convergenceTol set to default value of 0.001.
running() - Method in class org.apache.spark.scheduler.TaskInfo
RUNNING() - Static method in class org.apache.spark.TaskState
runningTasks() - Method in interface org.apache.spark.scheduler.Schedulable
runParallelPersonalizedPageRank(Graph<VD, ED>, int, double, long[], ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank: Run Personalized PageRank for a fixed number of iterations, for a set of starting nodes in parallel.
runPreCanonicalized(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.TriangleCount
runSqlHive(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Runs a HiveQL command using Hive, returning the results as a list of strings.
runtime() - Method in class org.apache.spark.status.api.v1.ApplicationEnvironmentInfo
RuntimeConfig - Class in org.apache.spark.sql: Runtime configuration interface for Spark.
RuntimeInfo - Class in org.apache.spark.status.api.v1
RuntimePercentage - Class in org.apache.spark.scheduler
RuntimePercentage(double, Option<Object>, double) - Constructor for class org.apache.spark.scheduler.RuntimePercentage
runUntilConvergence(Graph<VD, ED>, double, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank: Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
runUntilConvergenceWithOptions(Graph<VD, ED>, double, double, Option<Object>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank: Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
runWithOptions(Graph<VD, ED>, int, double, Option<Object>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank: Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
runWithValidation(RDD<LabeledPoint>, RDD<LabeledPoint>, BoostingStrategy, long, String) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees: Method to validate a gradient boosting model
runWithValidation(RDD<LabeledPoint>, RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Method to validate a gradient boosting model
runWithValidation(JavaRDD<LabeledPoint>, JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.runWithValidation.
RUtils - Class in org.apache.spark.api.r
RUtils() - Constructor for class org.apache.spark.api.r.RUtils
RWrappers - Class in org.apache.spark.ml.r: This is the Scala stub of SparkR read.ml.
RWrappers() - Constructor for class org.apache.spark.ml.r.RWrappers
RWrapperUtils - Class in org.apache.spark.ml.r
RWrapperUtils() - Constructor for class org.apache.spark.ml.r.RWrapperUtils

S

s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
safeCall(Function0<T>) - Method in interface org.apache.spark.security.CryptoStreamUtils.BaseErrorHandler
sameThread() - Static method in class org.apache.spark.util.ThreadUtils: An ExecutionContextExecutor that runs each task in the thread that invokes execute/submit.
sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a sampled subset of this RDD.
sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD: Return a sampled subset of this RDD with a random seed.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD: Return a sampled subset of this RDD, with a user-supplied seed.
sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD: Return a sampled subset of this RDD.
sample(double, long) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by sampling a fraction of rows (without replacement), using a user-supplied seed.
sample(double) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by sampling a fraction of rows (without replacement), using a random seed.
sample(boolean, double, long) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by sampling a fraction of rows, using a user-supplied seed.
sample(boolean, double) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by sampling a fraction of rows, using a random seed.
sample() - Method in class org.apache.spark.util.random.BernoulliCellSampler
sample() - Method in class org.apache.spark.util.random.BernoulliSampler
sample() - Method in class org.apache.spark.util.random.PoissonSampler
sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler: Take a random sample.
sample() - Method in interface org.apache.spark.util.random.RandomSampler: Whether to sample the next item or not.
sampleBy(String, Map<T, Object>, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Returns a stratified sample without replacement based on the fraction given on each stratum.
sampleBy(String, Map<T, Double>, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Returns a stratified sample without replacement based on the fraction given on each stratum.
sampleByKey(boolean, Map<K, Double>, long) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKey(boolean, Map<K, Double>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKeyExact(boolean, Map<K, Double>, long) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
sampleByKeyExact(boolean, Map<K, Double>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
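The sampleByKeyExact entries state the per-stratum sample size as math.ceil(numItems * samplingRate). A plain-Java sketch of just that arithmetic follows; the class ExactStratumSizeSketch is hypothetical, and the sampling itself is not shown.

```java
// Hypothetical helper class: the per-stratum size formula only.
final class ExactStratumSizeSketch {
    // numItems: number of pairs sharing one key (one stratum).
    // samplingRate: the fraction requested for that key.
    static long targetSize(long numItems, double samplingRate) {
        return (long) Math.ceil(numItems * samplingRate);
    }

    public static void main(String[] args) {
        // 10 items at rate 0.25 -> ceil(2.5)
        System.out.println(targetSize(10, 0.25));
        // 3 items at rate 0.5 -> ceil(1.5)
        System.out.println(targetSize(3, 0.5));
    }
}
```

Because the size is rounded up, any non-empty stratum with a positive rate contributes at least one item.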
SamplePathFilter - Class in org.apache.spark.ml.image: Filter that allows loading a fraction of HDFS files.
SamplePathFilter() - Constructor for class org.apache.spark.ml.image.SamplePathFilter
samplePointsPerPartitionHint() - Method in class org.apache.spark.RangePartitioner
sampleRatio() - Method in class org.apache.spark.ml.image.SamplePathFilter
sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.util.StatCounter: Return the sample standard deviation of the values, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.util.StatCounter: Return the sample variance, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
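The N-1 correction mentioned in the sampleStdev and sampleVariance entries can be sketched in plain Java as follows. The class SampleStatsSketch is hypothetical and works on an in-memory array of at least two values; it is a two-pass illustration, not the streaming computation a distributed implementation would use.

```java
// Hypothetical helper class: sample variance and standard deviation.
final class SampleStatsSketch {
    // Sum of squared deviations from the mean, divided by N-1 (not N).
    // Assumes xs.length >= 2.
    static double sampleVariance(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        double mean = sum / xs.length;
        double squaredDeviations = 0.0;
        for (double x : xs) {
            squaredDeviations += (x - mean) * (x - mean);
        }
        return squaredDeviations / (xs.length - 1);
    }

    // The sample standard deviation is the square root of the sample variance.
    static double sampleStdev(double[] xs) {
        return Math.sqrt(sampleVariance(xs));
    }

    public static void main(String[] args) {
        double[] xs = {2.0, 4.0, 6.0};
        System.out.println("sample variance = " + sampleVariance(xs));
        System.out.println("sample stdev    = " + sampleStdev(xs));
    }
}
```

For the values 2, 4, 6 the squared deviations sum to 8, so dividing by N-1 = 2 gives 4, whereas dividing by N = 3 would give about 2.67.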
SamplingUtils - Class in org.apache.spark.util.random
SamplingUtils() - Constructor for class org.apache.spark.util.random.SamplingUtils
satisfy(Distribution) - Method in interface org.apache.spark.sql.sources.v2.reader.partitioning.Partitioning: Returns true if this partitioning can satisfy the given distribution, which means Spark does not need to shuffle the output data of this data source for certain operations.
save(String) - Method in interface org.apache.spark.ml.util.MLWritable: Saves this ML instance to the input path, a shortcut of write.save(path).
save(String) - Method in class org.apache.spark.ml.util.MLWriter: Saves the ML instances to the input path.
save(SparkContext, String, String, int, int, Vector, double, Option<Object>) - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$: Helper method for saving GLM classification model metadata and data.
save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
save(SparkContext, String, org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0.Data) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$
save(SparkContext, String, org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0.Data) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$
save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.SVMModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
save(SparkContext, BisectingKMeansModel, String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV1_0$
save(SparkContext, BisectingKMeansModel, String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV2_0$
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel
save(SparkContext, KMeansModel, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV1_0$
save(SparkContext, KMeansModel, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV2_0$
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
save(SparkContext, PowerIterationClusteringModel, String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel.SaveLoadV1_0$
save(SparkContext, String) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
save(SparkContext, ChiSqSelectorModel, String) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$
save(SparkContext, String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel: Save this model to the given path.
save(FPGrowthModel<?>, String) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel.SaveLoadV1_0$
save(SparkContext, String) - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel: Save this model to the given path.
save(PrefixSpanModel<?>, String) - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel.SaveLoadV1_0$
save(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Save this model to the given path.
save(MatrixFactorizationModel, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$: Saves a MatrixFactorizationModel, where user features are saved under data/users and product features are saved under data/products.
save(SparkContext, String, String, Vector, double) - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$: Helper method for saving GLM regression model metadata and data.
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LassoModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
save(SparkContext, String, DecisionTreeModel) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
save(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Saveable: Save this model to the given path.
save(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame at the specified path.
save() - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame as the specified table.
Saveable - Interface in org.apache.spark.mllib.util: :: DeveloperApi ::
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
SaveAsHiveFile - Interface in org.apache.spark.sql.hive.execution
saveAsHiveFile(SparkSession, SparkPlan, Configuration, org.apache.spark.sql.hive.HiveShim.ShimFileSinkDesc, String, Map<Map<String, String>, String>, Seq<Attribute>) - Method in interface org.apache.spark.sql.hive.execution.SaveAsHiveFile
saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Save labeled data in LIBSVM format.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported storage system, using a Configuration object for that storage system.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop Configuration object for that storage system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream: Save each RDD in this DStream as a SequenceFile of serialized objects.
saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions: Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key and value types.
saveAsTable(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame as the specified table.
saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream: Save each RDD in this DStream as a text file, using string representations of elements.
savedTasks() - Method in class org.apache.spark.status.LiveStage
saveImpl(Params, PipelineStage[], SparkContext, String) - Method in class org.apache.spark.ml.Pipeline.SharedReadWrite$: Save metadata and stages for a Pipeline or PipelineModel: metadata is saved to path/metadata, and stages are saved to stages/IDX_UID.
saveImpl(M, String, SparkSession, JsonAST.JObject) - Static method in class org.apache.spark.ml.tree.EnsembleModelReadWrite: Helper method for saving a tree ensemble to disk.
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV1_0$
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV1_0$
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClusteringModel.SaveLoadV1_0$
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel.SaveLoadV1_0$
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.fpm.PrefixSpanModel.SaveLoadV1_0$
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel.SaveLoadV1_0$
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
SaveLoadV1_0$() - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
SaveLoadV2_0$() - Constructor for class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$
SaveLoadV2_0$() - Constructor for class org.apache.spark.mllib.clustering.BisectingKMeansModel.SaveLoadV2_0$
SaveLoadV2_0$() - Constructor for class org.apache.spark.mllib.clustering.KMeansModel.SaveLoadV2_0$
SaveMode - Enum in org.apache.spark.sql: SaveMode is used to specify the expected behavior of saving a DataFrame to a data source.
sc() - Method in class org.apache.spark.api.java.JavaSparkContext
sc() - Method in interface org.apache.spark.ml.util.BaseReadWrite: Returns the underlying `SparkContext`.
sc() - Method in class org.apache.spark.sql.SQLImplicits.StringToColumn
scal(double, Vector) - Static method in class org.apache.spark.ml.linalg.BLAS: x = a * x
scal(double, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS: x = a * x
scalaBoolean() - Static method in class org.apache.spark.sql.Encoders: An encoder for Scala's primitive boolean type.
scalaByte() - Static method in class org.apache.spark.sql.Encoders: An encoder for Scala's primitive byte type.
scalaDouble() - Static method in class org.apache.spark.sql.Encoders: An encoder for Scala's primitive double type.
scalaFloat() - Static method in class org.apache.spark.sql.Encoders: An encoder for Scala's primitive float type.
scalaInt() - Static method in class org.apache.spark.sql.Encoders: An encoder for Scala's primitive int type.
scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
scalaLong() - Static method in class org.apache.spark.sql.Encoders: An encoder for Scala's primitive long type.
scalaShort() - Static method in class org.apache.spark.sql.Encoders: An encoder for Scala's primitive short type.
scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
scalaVersion() - Method in class org.apache.spark.status.api.v1.RuntimeInfo
scale() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
scale() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
scale() - Method in class org.apache.spark.mllib.random.GammaGenerator
scale() - Method in class org.apache.spark.sql.types.Decimal
scale() - Method in class org.apache.spark.sql.types.DecimalType
scalingVec() - Method in class org.apache.spark.ml.feature.ElementwiseProduct: the vector to multiply with input vectors
scalingVec() - Method in class org.apache.spark.mllib.feature.ElementwiseProduct
Schedulable - Interface in org.apache.spark.scheduler: An interface for schedulable entities.
SchedulableBuilder - Interface in org.apache.spark.scheduler: An interface to build the Schedulable tree. buildPools builds the tree nodes (pools); addTaskSetManager builds the leaf nodes (TaskSetManagers).
schedulableQueue() - Method in interface org.apache.spark.scheduler.Schedulable
SCHEDULED() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
SCHEDULER_DELAY() - Static method in class org.apache.spark.status.TaskIndexNames
SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
SCHEDULER_DELAY() - Static method in class org.apache.spark.ui.ToolTips
SchedulerBackend - Interface in org.apache.spark.scheduler: A backend interface for scheduling systems that allows plugging in different ones under TaskSchedulerImpl.
SchedulerBackendUtils - Class in org.apache.spark.scheduler.cluster
SchedulerBackendUtils() - Constructor for class org.apache.spark.scheduler.cluster.SchedulerBackendUtils
schedulerDelay() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
schedulerDelay(TaskData) - Static method in class org.apache.spark.status.AppStatusUtils
schedulerDelay(long, long, long, long, long, long) - Static method in class org.apache.spark.status.AppStatusUtils
SchedulerPool - Class in org.apache.spark.status
SchedulerPool(String) - Constructor for class org.apache.spark.status.SchedulerPool
SchedulingAlgorithm - Interface in org.apache.spark.scheduler: An interface for sort algorithms. FIFO: FIFO algorithm between TaskSetManagers. FS: FS algorithm between Pools, and FIFO or FS within Pools.
schedulingDelay() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for the first job of this batch to start processing from the time this batch was submitted to the streaming scheduler.
schedulingMode() - Method in interface org.apache.spark.scheduler.Schedulable
SchedulingMode - Class in org.apache.spark.scheduler: "FAIR" and "FIFO" determine which policy is used to order tasks amongst a Schedulable's sub-queues. "NONE" is used when a Schedulable has no sub-queues.
SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
schedulingMode() - Method in interface org.apache.spark.scheduler.TaskScheduler
schedulingPool() - Method in class org.apache.spark.status.api.v1.StageData
schedulingPool() - Method in class org.apache.spark.status.LiveStage
schema(StructType) - Method in class org.apache.spark.sql.DataFrameReader: Specifies the input schema.
schema(String) - Method in class org.apache.spark.sql.DataFrameReader: Specifies the schema by using the input DDL-formatted string.
schema() - Method in class org.apache.spark.sql.Dataset: Returns the schema of this Dataset.
schema() - Method in interface org.apache.spark.sql.Encoder: Returns the schema of encoding this type of object as a Row.
schema() - Method in interface org.apache.spark.sql.Row: Schema for the row.
schema() - Method in class org.apache.spark.sql.sources.BaseRelation
schema(StructType) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Specifies the input schema.
schema(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Specifies the schema by using the input DDL-formatted string.
schema_of_json(String) - Static method in class org.apache.spark.sql.functions: Parses a JSON string and infers its schema in DDL format.
schema_of_json(Column) - Static method in class org.apache.spark.sql.functions: Parses a JSON string and infers its schema in DDL format.
schemaLess() - Method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
SchemaRelationProvider - Interface in org.apache.spark.sql.sources: Implemented by objects that produce relations for a specific kind of data source with a given schema.
SchemaUtils - Class in org.apache.spark.ml.util: Utils for handling schemas.
SchemaUtils() - Constructor for class org.apache.spark.ml.util.SchemaUtils
SchemaUtils - Class in org.apache.spark.sql.util: Utils for handling schemas.
SchemaUtils() - Constructor for class org.apache.spark.sql.util.SchemaUtils
scope() - Method in class org.apache.spark.storage.RDDInfo
scoreAndLabels() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
scratch() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
script() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
Scripts() - Method in interface org.apache.spark.sql.hive.HiveStrategies
Scripts() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.Scripts
Scripts$() - Constructor for class org.apache.spark.sql.hive.HiveStrategies.Scripts$
ScriptTransformationExec - Class in org.apache.spark.sql.hive.execution: Transforms the input by forking and running the specified script.
ScriptTransformationExec(Seq<Expression>, String, Seq<Attribute>, SparkPlan, HiveScriptIOSchema) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformationExec
ScriptTransformationWriterThread - Class in org.apache.spark.sql.hive.execution
ScriptTransformationWriterThread(Iterator<InternalRow>, Seq<DataType>, org.apache.spark.sql.catalyst.expressions.Projection, AbstractSerDe, ObjectInspector, HiveScriptIOSchema, OutputStream, Process, org.apache.spark.util.CircularBuffer, TaskContext, Configuration) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread
second(Column) - Static method in class org.apache.spark.sql.functions: Extracts the seconds as an integer from a given date/timestamp/string.
seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
seconds(long) - Static method in class org.apache.spark.streaming.Durations
Seconds - Class in org.apache.spark.streaming: Helper object that creates an instance of Duration representing a given number of seconds.
Seconds() - Constructor for class org.apache.spark.streaming.Seconds
securityManager() - Method in class org.apache.spark.SparkEnv
securityManager() - Method in interface org.apache.spark.status.api.v1.UIRoot
seed() - Method in interface org.apache.spark.ml.param.shared.HasSeed: Param for random seed.
seedParam() - Static method in class org.apache.spark.ml.image.SamplePathFilter
select(Column...) - Method in class org.apache.spark.sql.Dataset: Selects a set of column based expressions.
select(String, String...) - Method in class org.apache.spark.sql.Dataset: Selects a set of columns.
select(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Selects a set of column based expressions.
select(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset: Selects a set of columns.
select(TypedColumn<T, U1>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: Returns a new Dataset by computing the given Column expression for each element.
select(TypedColumn<T, U1>, TypedColumn<T, U2>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: Returns a new Dataset by computing the given Column expressions for each element.
select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: Returns a new Dataset by computing the given Column expressions for each element.
select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>, TypedColumn<T, U4>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: Returns a new Dataset by computing the given Column expressions for each element.
select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>, TypedColumn<T, U4>, TypedColumn<T, U5>) - Method in class org.apache.spark.sql.Dataset: :: Experimental :: Returns a new Dataset by computing the given Column expressions for each element.
selectedFeatures() - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel: list of indices to select (filter).
selectedFeatures() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
selectExpr(String...) - Method in class org.apache.spark.sql.Dataset: Selects a set of SQL expressions.
selectExpr(Seq<String>) - Method in class org.apache.spark.sql.Dataset: Selects a set of SQL expressions.
selectorType() - Method in interface org.apache.spark.ml.feature.ChiSqSelectorParams: The selector type of the ChiSqSelector.
selectorType() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
self() - Method in interface org.apache.spark.rpc.RpcEndpoint: The RpcEndpointRef of this RpcEndpoint.
sendData(String, Seq<Object>) - Method in interface org.apache.spark.streaming.kinesis.KinesisDataGenerator: Sends the data to Kinesis and returns the metadata for everything that has been sent.
sender() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
senderAddress() - Method in interface org.apache.spark.rpc.RpcCallContext: The sender of this message.
sendFailure(Throwable) - Method in interface org.apache.spark.rpc.RpcCallContext: Report a failure to the sender.
sendToDst(A) - Method in class org.apache.spark.graphx.EdgeContext: Sends a message to the destination vertex.
sendToDst(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
sendToSrc(A) - Method in class org.apache.spark.graphx.EdgeContext: Sends a message to the source vertex.
sendToSrc(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
sendWith(TransportClient) - Method in interface org.apache.spark.rpc.netty.OutboxMessage
seqToString(Seq<T>, Function1<T, String>) - Static method in class org.apache.spark.internal.config.ConfigHelpers
sequence() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
sequence(Column, Column, Column) - Static method in class org.apache.spark.sql.functions: Generate a sequence of integers from start to stop, incrementing by step.
sequence(Column, Column) - Static method in class org.apache.spark.sql.functions: Generate a sequence of integers from start to stop, incrementing by 1 if start is less than or equal to stop, otherwise -1.
sequenceCol() - Method in class org.apache.spark.ml.fpm.PrefixSpan: Param for the name of the sequence column in dataset (default "sequence"), rows with nulls in this column are ignored.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop SequenceFile.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext: Version of sequenceFile() for types implicitly convertible to Writables through a WritableConverter.
SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion.
SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Class<? extends Writable>, Class<? extends Writable>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
SER_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
SerDe - Class in org.apache.spark.api.r: Utility functions to serialize and deserialize objects to and from R.
SerDe() - Constructor for class org.apache.spark.api.r.SerDe
SERDE() - Static method in class org.apache.spark.sql.hive.execution.HiveOptions
serde() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
serdeProperties() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
SerializableMapWrapper(Map<A, B>) - Constructor for class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
SerializationDebugger - Class in org.apache.spark.serializer
SerializationDebugger() - Constructor for class org.apache.spark.serializer.SerializationDebugger
SerializationDebugger.ObjectStreamClassMethods - Class in org.apache.spark.serializer: An implicit class that allows us to call private methods of ObjectStreamClass.
SerializationDebugger.ObjectStreamClassMethods$ - Class in org.apache.spark.serializer
SerializationFormats - Class in org.apache.spark.api.r
SerializationFormats() - Constructor for class org.apache.spark.api.r.SerializationFormats
SerializationStream - Class in org.apache.spark.serializer: :: DeveloperApi :: A stream for writing serialized objects.
SerializationStream() - Constructor for class org.apache.spark.serializer.SerializationStream
serializationStream() - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
serialize(Vector) - Method in class org.apache.spark.mllib.linalg.VectorUDT
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
serialize(T) - Static method in class org.apache.spark.util.Utils: Serialize an object using Java serialization
SERIALIZED_R_DATA_SCHEMA() - Static method in class org.apache.spark.sql.api.r.SQLUtils
serializedData() - Method in class org.apache.spark.scheduler.local.StatusUpdate
serializedMapStatus(org.apache.spark.broadcast.BroadcastManager, boolean, int) - Method in class org.apache.spark.ShuffleStatus: Serializes the mapStatuses array into an efficient compressed format.
SerializedMemoryEntry<T> - Class in org.apache.spark.storage.memory
SerializedMemoryEntry(org.apache.spark.util.io.ChunkedByteBuffer, MemoryMode, ClassTag<T>) - Constructor for class org.apache.spark.storage.memory.SerializedMemoryEntry
SerializedValuesHolder<T> - Class in org.apache.spark.storage.memory: A holder for storing the serialized values.
SerializedValuesHolder(BlockId, int, ClassTag<T>, MemoryMode, org.apache.spark.serializer.SerializerManager) - Constructor for class org.apache.spark.storage.memory.SerializedValuesHolder
Serializer - Class in org.apache.spark.serializer: :: DeveloperApi :: A serializer.
Serializer() - Constructor for class org.apache.spark.serializer.Serializer
serializer() - Method in class org.apache.spark.ShuffleDependency
serializer() - Method in class org.apache.spark.SparkEnv
SerializerInstance - Class in org.apache.spark.serializer: :: DeveloperApi :: An instance of a serializer, for use by one thread at a time.
SerializerInstance() - Constructor for class org.apache.spark.serializer.SerializerInstance
serializerManager() - Method in class org.apache.spark.SparkEnv
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.DummySerializerInstance
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.SerializerInstance
serializeViaNestedStream(OutputStream, SerializerInstance, Function1<SerializationStream, BoxedUnit>) - Static method in class org.apache.spark.util.Utils: Serialize via nested stream using specific serializer
servletContext() - Method in interface org.apache.spark.status.api.v1.ApiRequestContext
ServletParams(Function1<HttpServletRequest, T>, String, Function1<T, String>) - Constructor for class org.apache.spark.ui.JettyUtils.ServletParams
ServletParams$() - Constructor for class org.apache.spark.ui.JettyUtils.ServletParams$
session(SparkSession) - Static method in class org.apache.spark.ml.r.RWrappers
session(SparkSession) - Method in interface org.apache.spark.ml.util.BaseReadWrite: Sets the Spark Session to use for saving/loading.
session(SparkSession) - Method in class org.apache.spark.ml.util.GeneralMLWriter
session(SparkSession) - Method in class org.apache.spark.ml.util.MLReader
session(SparkSession) - Method in class org.apache.spark.ml.util.MLWriter
sessionCatalog() - Method in class org.apache.spark.sql.hive.RelationConversions
SessionConfigSupport - Interface in org.apache.spark.sql.sources.v2: A mix-in interface for DataSourceV2.
sessionState() - Method in class org.apache.spark.sql.SparkSession: State isolated across sessions, including SQL configurations, temporary tables, registered functions, and everything else that accepts a SQLConf.
set(long, long, int, int, VD, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
Set() - Static method in class org.apache.spark.metrics.sink.StatsdMetricType
set(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params: Sets a parameter in the embedded param map.
set(String, Object) - Method in interface org.apache.spark.ml.param.Params: Sets a parameter (by name) in the embedded param map.
set(ParamPair<?>) - Method in interface org.apache.spark.ml.param.Params: Sets a parameter in the embedded param map.
set(String, long, long) - Static method in class org.apache.spark.rdd.InputFileBlockHolder: Sets the thread-local input block.
set(String, String) - Method in class org.apache.spark.SparkConf: Set a configuration variable.
set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
set(String, String) - Method in class org.apache.spark.sql.RuntimeConfig: Sets the given Spark runtime configuration property.
set(String, boolean) - Method in class org.apache.spark.sql.RuntimeConfig: Sets the given Spark runtime configuration property.
set(String, long) - Method in class org.apache.spark.sql.RuntimeConfig: Sets the given Spark runtime configuration property.
set(long) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given Long.
set(int) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given Int.
set(long, int, int) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given unscaled Long, with a given precision and scale.
set(BigDecimal, int, int) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given BigDecimal value, with a given precision and scale.
set(BigDecimal) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given BigDecimal value, inheriting its precision and scale.
set(BigInteger) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given BigInteger value. If the value is not in the range of long, it is converted to BigDecimal, and the precision and scale are based on the converted value.
set(Decimal) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given Decimal value.
setActive(SQLContext) - Static method in class org.apache.spark.sql.SQLContext: Deprecated. Use SparkSession.setActiveSession instead. Since 2.0.0.
setActiveSession(SparkSession) - Static method in class org.apache.spark.sql.SparkSession: Changes the SparkSession that will be returned in this thread and its children when SparkSession.getOrCreate() is called.
setAggregationDepth(int) - Method in class org.apache.spark.ml.classification.LinearSVC: Suggested depth for treeAggregate (greater than or equal to 2).
setAggregationDepth(int) - Method in class org.apache.spark.ml.classification.LogisticRegression: Suggested depth for treeAggregate (greater than or equal to 2).
setAggregationDepth(int) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression: Suggested depth for treeAggregate (greater than or equal to 2).
setAggregationDepth(int) - Method in class org.apache.spark.ml.regression.LinearRegression: Suggested depth for treeAggregate (greater than or equal to 2).
setAggregator(Aggregator<K, V, C>) - Method in class org.apache.spark.rdd.ShuffledRDD: Set aggregator for RDD's shuffle.
setAlgo(String) - Method in class org.apache.spark.mllib.tree.configuration.Strategy: Sets Algorithm using a String.
setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf: Set multiple parameters together.
setAlpha(double) - Method in class org.apache.spark.ml.recommendation.ALS
setAlpha(Vector) - Method in class org.apache.spark.mllib.clustering.LDA: Alias for setDocConcentration().
setAlpha(double) - Method in class org.apache.spark.mllib.clustering.LDA: Alias for setDocConcentration().
setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS: Sets the constant used in computing confidence in implicit ALS.
setAppName(String) - Method in class org.apache.spark.launcher.AbstractLauncher: Set the application name.
setAppName(String) - Method in class org.apache.spark.launcher.SparkLauncher
setAppName(String) - Method in class org.apache.spark.SparkConf: Set a name for your application.
setAppResource(String) - Method in class org.apache.spark.launcher.AbstractLauncher: Set the main application resource.
setAppResource(String) - Method in class org.apache.spark.launcher.SparkLauncher
setBandwidth(double) - Method in class org.apache.spark.mllib.stat.KernelDensity: Sets the bandwidth (standard deviation) of the Gaussian kernel (default: 1.0).
setBeta(double) - Method in class org.apache.spark.mllib.clustering.LDA: Alias for setTopicConcentration().
setBinary(boolean) - Method in class org.apache.spark.ml.feature.CountVectorizer
setBinary(boolean) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
setBinary(boolean) - Method in class org.apache.spark.ml.feature.HashingTF
setBinary(boolean) - Method in class org.apache.spark.mllib.feature.HashingTF: If true, the term frequency vector will be binary, such that non-zero term counts are set to 1 (default: false).
setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the number of blocks for both user blocks and product blocks to parallelize the computation into; pass -1 for an auto-configured number of blocks.
setBlockSize(int) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Sets the value of param blockSize.
setBucketLength(double) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.GBTClassifier
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.GBTRegressor
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setCacheNodeIds(boolean) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Deprecated. This method is deprecated and will be removed in 3.0.0.
setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Pass-through to SparkContext.setCallSite.
setCallSite(String) - Method in class org.apache.spark.SparkContext: Set the thread-local property for overriding the call sites of actions and RDDs.
setCaseSensitive(boolean) - Method in class org.apache.spark.ml.feature.StopWordsRemover
setCategoricalCols(String[]) - Method in class org.apache.spark.ml.feature.FeatureHasher
setCategoricalFeaturesInfo(Map<Integer, Integer>) - Method in class org.apache.spark.mllib.tree.configuration.Strategy: Sets categoricalFeaturesInfo using a Java Map.
setCensorCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Set the directory under which RDDs are going to be checkpointed.
setCheckpointDir(String) - Method in class org.apache.spark.SparkContext: Set the directory under which RDDs are going to be checkpointed.
setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier: Specifies how often to checkpoint the cached node IDs.
setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.GBTClassifier: Specifies how often to checkpoint the cached node IDs.
setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier: Specifies how often to checkpoint the cached node IDs.
setCheckpointInterval(int) - Method in class org.apache.spark.ml.clustering.LDA
setCheckpointInterval(int) - Method in class org.apache.spark.ml.recommendation.ALS
setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor: Specifies how often to checkpoint the cached node IDs.
setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.GBTRegressor: Specifies how often to checkpoint the cached node IDs.
setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor: Specifies how often to checkpoint the cached node IDs.
setCheckpointInterval(int) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Deprecated. This method is deprecated and will be removed in 3.0.0.
setCheckpointInterval(int) - Method in class org.apache.spark.mllib.clustering.LDA: Sets the checkpoint interval (greater than or equal to 1), or disables checkpointing (-1).
setCheckpointInterval(int) - Method in class org.apache.spark.mllib.recommendation.ALS: :: DeveloperApi :: Set period (in iterations) between checkpoints (default = 10).
setCheckpointInterval(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setClassifier(Classifier<?, ?, ?>) - Method in class org.apache.spark.ml.classification.OneVsRest
setColdStartStrategy(String) - Method in class org.apache.spark.ml.recommendation.ALS
setColdStartStrategy(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
setCollectSubModels(boolean) - Method in class org.apache.spark.ml.tuning.CrossValidator: Whether to collect submodels when fitting.
setCollectSubModels(boolean) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit: Whether to collect submodels when fitting.
setConf(Configuration) - Method in interface org.apache.spark.input.Configurable
setConf(String, String) - Method in class org.apache.spark.launcher.AbstractLauncher: Set a single configuration value for the application.
setConf(String, String) - Method in class org.apache.spark.launcher.SparkLauncher
setConf(Configuration) - Method in class org.apache.spark.ml.image.SamplePathFilter
setConf(Properties) - Method in class org.apache.spark.sql.SQLContext: Set Spark SQL configuration properties.
setConf(String, String) - Method in class org.apache.spark.sql.SQLContext: Set the given Spark SQL configuration property.
setConfig(String, String) - Static method in class org.apache.spark.launcher.SparkLauncher: Set a configuration value for the launcher library.
setConvergenceTol(double) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Set the largest change in log-likelihood at which convergence is considered to have occurred.
setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the convergence tolerance.
setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the convergence tolerance of iterations for L-BFGS.
setConvergenceTol(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the convergence tolerance.
setCurrentDatabase(String) - Method in class org.apache.spark.sql.catalog.Catalog: Sets the current default database in this session.
setCurrentDatabase(String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Sets the name of current database.
setCustomHostname(String) - Static method in class org.apache.spark.util.Utils: Allow setting a custom host name because when we run on Mesos we need to use the same hostname it reports to the master.
setDAGScheduler(DAGScheduler) - Method in interface org.apache.spark.scheduler.TaskScheduler
setDecayFactor(double) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Set the forgetfulness of the previous centroids.
setDefault(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params: Sets a default value for a param.
setDefault(Seq<ParamPair<?>>) - Method in interface org.apache.spark.ml.param.Params: Sets default values for a list of params.
setDefaultClassLoader(ClassLoader) - Method in class org.apache.spark.serializer.Serializer: Sets a class loader for the serializer to use in deserialization.
setDefaultSession(SparkSession) - Static method in class org.apache.spark.sql.SparkSession: Sets the default SparkSession that is returned by the builder.
setDegree(int) - Method in class org.apache.spark.ml.feature.PolynomialExpansion
setDeployMode(String) - Method in class org.apache.spark.launcher.AbstractLauncher: Set the deploy mode for the application.
setDeployMode(String) - Method in class org.apache.spark.launcher.SparkLauncher
setDistanceMeasure(String) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
setDistanceMeasure(String) - Method in class org.apache.spark.ml.clustering.KMeans
setDistanceMeasure(String) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
setDistanceMeasure(String) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Set the distance measure used by the algorithm.
setDistanceMeasure(String) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the distance measure used by the algorithm.
setDocConcentration(double[]) - Method in class org.apache.spark.ml.clustering.LDA
setDocConcentration(double) - Method in class org.apache.spark.ml.clustering.LDA
setDocConcentration(Vector) - Method in class org.apache.spark.mllib.clustering.LDA: Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
setDocConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA: Replicates a Double docConcentration to create a symmetric prior.
setDropLast(boolean) - Method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
setDropLast(boolean) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
setDropLast(boolean) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
setDstCol(String) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
setElasticNetParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the ElasticNet mixing parameter.
setElasticNetParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression: Set the ElasticNet mixing parameter.
setEpsilon(double) - Method in class org.apache.spark.ml.regression.LinearRegression: Sets the value of param epsilon.
setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the distance threshold within which we consider centers to have converged.
setError(PrintStream) - Method in interface org.apache.spark.sql.hive.client.HiveClient
setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.CrossValidator
setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.CrossValidator
setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.CrossValidator
setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf: Set an environment variable to be used when launching executors for this application.
setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf: Set multiple environment variables to be used when launching executors.
setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf: Set multiple environment variables to be used when launching executors.
setFamily(String) - Method in class org.apache.spark.ml.classification.LogisticRegression: Sets the value of param family.
setFamily(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the value of param family.
setFdr(double) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setFdr(double) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
setFeatureIndex(int) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setFeatureIndex(int) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
setFeaturesCol(String) - Method in class org.apache.spark.ml.classification.OneVsRestModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixture
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.KMeans
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.KMeansModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.LDA: The features for LDA should be a Vector representing the word counts in a document.
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.LDAModel: The features for LDA should be a Vector representing the word counts in a document.
setFeaturesCol(String) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.RFormula
setFeaturesCol(String) - Method in class org.apache.spark.ml.PredictionModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.Predictor
setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setFeatureSubsetStrategy(String) - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams: Deprecated. This method is deprecated and will be removed in 3.0.0.
setFinalRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS: :: DeveloperApi :: Sets storage level for final RDDs (user/product used in MatrixFactorizationModel).
setFinalStorageLevel(String) - Method in class org.apache.spark.ml.recommendation.ALS
setFitIntercept(boolean) - Method in class org.apache.spark.ml.classification.LinearSVC: Whether to fit an intercept term.
setFitIntercept(boolean) - Method in class org.apache.spark.ml.classification.LogisticRegression: Whether to fit an intercept term.
setFitIntercept(boolean) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression: Set if we should fit the intercept. Default is true.
setFitIntercept(boolean) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets if we should fit the intercept.
setFitIntercept(boolean) - Method in class org.apache.spark.ml.regression.LinearRegression: Set if we should fit the intercept.
setForceIndexLabel(boolean) - Method in class org.apache.spark.ml.feature.RFormula
setFormula(String) - Method in class org.apache.spark.ml.feature.RFormula: Sets the formula to use for this transformer.
setFpr(double) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setFpr(double) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
setFwe(double) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setFwe(double) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
setGaps(boolean) - Method in class org.apache.spark.ml.feature.RegexTokenizer
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the gradient function (of the loss function of one single data example) to be used for SGD.
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the gradient function (of the loss function of one single data example) to be used for L-BFGS.
setHalfLife(double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Set the half life and time unit ("batches" or "points").
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.Bucketizer
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.RFormula
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.StringIndexer
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.VectorAssembler
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.VectorSizeHint
setHashAlgorithm(String) - Method in class org.apache.spark.mllib.feature.HashingTF: Set the hash algorithm used when mapping term to integer.
setIfMissing(String, String) - Method in class org.apache.spark.SparkConf: Set a parameter if it isn't already configured.
setImplicitPrefs(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS: Sets whether to use implicit preference.
setImpurity(String) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setImpurity(String) - Method in class org.apache.spark.ml.classification.GBTClassifier: The impurity setting is ignored for GBT models.
setImpurity(String) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setImpurity(String) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setImpurity(String) - Method in class org.apache.spark.ml.regression.GBTRegressor: The impurity setting is ignored for GBT models.
setImpurity(String) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setImpurity(String) - Method in interface org.apache.spark.ml.tree.TreeClassifierParams: Deprecated. This method is deprecated and will be removed in 3.0.0.
setImpurity(String) - Method in interface org.apache.spark.ml.tree.TreeRegressorParams: Deprecated. This method is deprecated and will be removed in 3.0.0.
setImpurity(Impurity) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setIndices(int[]) - Method in class org.apache.spark.ml.feature.VectorSlicer
setInfo(PrintStream) - Method in interface org.apache.spark.sql.hive.client.HiveClient
setInitialCenters(Vector[], double[]) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Specify initial centers directly.
setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the initialization algorithm.
setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering: Set the initialization mode.
setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the number of steps for the k-means|| initialization mode.
setInitialModel(GaussianMixtureModel) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Set the initial GMM starting point, bypassing the random initialization.
setInitialModel(KMeansModel) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the initial starting point, bypassing the random initialization or k-means||. The condition model.k == this.k must be met; failure results in an IllegalArgumentException.
setInitialWeights(Vector) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Sets the value of param initialWeights.
setInitialWeights(Vector) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Set the initial weights.
setInitialWeights(Vector) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the initial weights.
setInitMode(String) - Method in class org.apache.spark.ml.clustering.KMeans
setInitMode(String) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
setInitSteps(int) - Method in class org.apache.spark.ml.clustering.KMeans
setInputCol(String) - Method in class org.apache.spark.ml.feature.Binarizer
setInputCol(String) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
setInputCol(String) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.Bucketizer
setInputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizer
setInputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.HashingTF
setInputCol(String) - Method in class org.apache.spark.ml.feature.IDF
setInputCol(String) - Method in class org.apache.spark.ml.feature.IDFModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.IndexToString
setInputCol(String) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
setInputCol(String) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.MinHashLSH
setInputCol(String) - Method in class org.apache.spark.ml.feature.MinHashLSHModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScaler
setInputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
setInputCol(String) - Method in class org.apache.spark.ml.feature.PCA
setInputCol(String) - Method in class org.apache.spark.ml.feature.PCAModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.StopWordsRemover
setInputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexer
setInputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorSizeHint
setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorSlicer
setInputCol(String) - Method in class org.apache.spark.ml.feature.Word2Vec
setInputCol(String) - Method in class org.apache.spark.ml.feature.Word2VecModel
setInputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.Bucketizer
setInputCols(Seq<String>) - Method in class org.apache.spark.ml.feature.FeatureHasher
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.FeatureHasher
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.Imputer
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.ImputerModel
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.Interaction
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.VectorAssembler
setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Set if the algorithm should add an intercept.
setIntermediateRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS: :: DeveloperApi :: Sets storage level for intermediate RDDs (user/product in/out links).
setIntermediateStorageLevel(String) - Method in class org.apache.spark.ml.recommendation.ALS
setInverse(boolean) - Method in class org.apache.spark.ml.feature.DCT
setIsotonic(boolean) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setIsotonic(boolean) - Method in class org.apache.spark.mllib.regression.IsotonicRegression: Sets the isotonic parameter.
setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
setItemsCol(String) - Method in class org.apache.spark.ml.fpm.FPGrowth
setItemsCol(String) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the number of iterations to run.
setJars(Seq<String>) - Method in class org.apache.spark.SparkConf: Set JAR files to distribute to the cluster.
setJars(String[]) - Method in class org.apache.spark.SparkConf: Set JAR files to distribute to the cluster.
setJavaHome(String) - Method in class org.apache.spark.launcher.SparkLauncher: Set a custom JAVA_HOME for launching the Spark application.
setJobDescription(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Set a human readable description of the current job.
setJobDescription(String) - Method in class org.apache.spark.SparkContext: Set a human readable description of the current job.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setK(int) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
setK(int) - Method in class org.apache.spark.ml.clustering.GaussianMixture
setK(int) - Method in class org.apache.spark.ml.clustering.KMeans
setK(int) - Method in class org.apache.spark.ml.clustering.LDA
setK(int) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
setK(int) - Method in class org.apache.spark.ml.feature.PCA
setK(int) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Sets the desired number of leaf clusters (default: 4).
setK(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Set the number of Gaussians in the mixture model.
setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the number of clusters to create (k).
setK(int) - Method in class org.apache.spark.mllib.clustering.LDA: Set the number of topics to infer, i.e., the number of soft cluster centers.
setK(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering: Set the number of clusters.
setK(int) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Set the number of clusters.
setKappa(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Learning rate: the exponential decay rate. It should be in the range (0.5, 1.0] to guarantee asymptotic convergence.
setKeepLastCheckpoint(boolean) - Method in class org.apache.spark.ml.clustering.LDA
setKeepLastCheckpoint(boolean) - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer: If using checkpointing, this indicates whether to keep the last checkpoint (vs clean up).
setKeyOrdering(Ordering<K>) - Method in class org.apache.spark.rdd.ShuffledRDD: Set key ordering for RDD's shuffle.
setLabelCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
setLabelCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setLabelCol(String) - Method in class org.apache.spark.ml.feature.RFormula
setLabelCol(String) - Method in class org.apache.spark.ml.Predictor
setLabelCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setLabelCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setLabels(String[]) - Method in class org.apache.spark.ml.feature.IndexToString
setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes: Set the smoothing parameter.
setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the regularization parameter, lambda.
setLayers(int[]) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Sets the value of param layers.
setLearningDecay(double) - Method in class org.apache.spark.ml.clustering.LDA
setLearningOffset(double) - Method in class org.apache.spark.ml.clustering.LDA
setLearningRate(double) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets initial learning rate (default: 0.025).
setLearningRate(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
setLink(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the value of param link.
setLinkPower(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the value of param linkPower.
setLinkPredictionCol(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the link prediction (linear predictor) column name.
setLinkPredictionCol(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel: Sets the link prediction (linear predictor) column name.
setLocale(String) - Method in class org.apache.spark.ml.feature.StopWordsRemover
setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Set a local property that affects jobs submitted from this thread, and all child threads, such as the Spark fair scheduler pool.
setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext: Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setLogLevel(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Control the log level.
setLogLevel(String) - Method in class org.apache.spark.SparkContext: Control the log level.
setLogLevel(Level) - Static method in class org.apache.spark.util.Utils: Configure a new log4j level.
setLoss(String) - Method in class org.apache.spark.ml.regression.LinearRegression: Sets the value of param loss.
setLoss(Loss) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
setLossType(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
setLossType(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
setLowerBoundsOnCoefficients(Matrix) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the lower bounds on coefficients if fitting under bound constrained optimization.
setLowerBoundsOnIntercepts(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the lower bounds on intercepts if fitting under bound constrained optimization.
setMainClass(String) - Method in class org.apache.spark.launcher.AbstractLauncher: Sets the application class name for Java/Scala applications.
setMainClass(String) - Method in class org.apache.spark.launcher.SparkLauncher
setMapSideCombine(boolean) - Method in class org.apache.spark.rdd.ShuffledRDD: Set mapSideCombine flag for RDD's shuffle.
setMaster(String) - Method in class org.apache.spark.launcher.AbstractLauncher: Set the Spark master for the application.
setMaster(String) - Method in class org.apache.spark.launcher.SparkLauncher
setMaster(String) - Method in class org.apache.spark.SparkConf: The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
setMax(double) - Method in class org.apache.spark.ml.feature.MinMaxScaler
setMax(double) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
setMaxBins(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setMaxBins(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMaxBins(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setMaxBins(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setMaxBins(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMaxBins(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setMaxBins(int) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Deprecated.
This method is deprecated and will be removed in 3.0.0.
setMaxBins(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setMaxCategories(int) - Method in class org.apache.spark.ml.feature.VectorIndexer
setMaxDepth(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setMaxDepth(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMaxDepth(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setMaxDepth(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setMaxDepth(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMaxDepth(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setMaxDepth(int) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Deprecated.
This method is deprecated and will be removed in 3.0.0.
setMaxDepth(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setMaxDF(double) - Method in class org.apache.spark.ml.feature.CountVectorizer
setMaxIter(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMaxIter(int) - Method in class org.apache.spark.ml.classification.LinearSVC: Set the maximum number of iterations.
setMaxIter(int) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the maximum number of iterations.
setMaxIter(int) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Set the maximum number of iterations.
setMaxIter(int) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
setMaxIter(int) - Method in class org.apache.spark.ml.clustering.GaussianMixture
setMaxIter(int) - Method in class org.apache.spark.ml.clustering.KMeans
setMaxIter(int) - Method in class org.apache.spark.ml.clustering.LDA
setMaxIter(int) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
setMaxIter(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setMaxIter(int) - Method in class org.apache.spark.ml.recommendation.ALS
setMaxIter(int) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression: Set the maximum number of iterations.
setMaxIter(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMaxIter(int) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the maximum number of iterations (applicable for solver "irls").
setMaxIter(int) - Method in class org.apache.spark.ml.regression.LinearRegression: Set the maximum number of iterations.
setMaxIter(int) - Method in interface org.apache.spark.ml.tree.GBTParams: Deprecated.
This method is deprecated and will be removed in 3.0.0.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Sets the max number of k-means iterations to split clusters (default: 20).
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Set the maximum number of iterations allowed.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set maximum number of iterations allowed.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.LDA: Set the maximum number of iterations allowed.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering: Set the maximum number of iterations of the power iteration loop.
setMaxLocalProjDBSize(long) - Method in class org.apache.spark.ml.fpm.PrefixSpan
setMaxLocalProjDBSize(long) - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Sets the maximum number of items (including delimiters used in the internal storage format) allowed in a projected database before local processing (default: 32000000L).
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setMaxMemoryInMB(int) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Deprecated.
This method is deprecated and will be removed in 3.0.0.
setMaxMemoryInMB(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setMaxPatternLength(int) - Method in class org.apache.spark.ml.fpm.PrefixSpan
setMaxPatternLength(int) - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Sets maximal pattern length (default: 10).
setMaxSentenceLength(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setMaxSentenceLength(int) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets the maximum length (in words) of each sentence in the input data.
setMetricName(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
setMetricName(String) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
setMetricName(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
setMetricName(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
setMin(double) - Method in class org.apache.spark.ml.feature.MinMaxScaler
setMin(double) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
setMinConfidence(double) - Method in class org.apache.spark.ml.fpm.FPGrowth
setMinConfidence(double) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
setMinConfidence(double) - Method in class org.apache.spark.mllib.fpm.AssociationRules: Sets the minimal confidence (default: 0.8).
setMinCount(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setMinCount(int) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets minCount, the minimum number of times a token must appear to be included in the word2vec model's vocabulary (default: 5).
setMinDF(double) - Method in class org.apache.spark.ml.feature.CountVectorizer
setMinDivisibleClusterSize(double) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
setMinDivisibleClusterSize(double) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Sets the minimum number of points (if greater than or equal to 1.0) or the minimum proportion of points (if less than 1.0) of a divisible cluster (default: 1).
setMinDocFreq(int) - Method in class org.apache.spark.ml.feature.IDF
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Set the fraction of each batch to use for updates.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Mini-batch fraction in (0, 1], which sets the fraction of documents sampled and used in each iteration.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set fraction of data to be used for each SGD iteration.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the fraction of each batch to use for updates.
setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setMinInfoGain(double) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Deprecated.
This method is deprecated and will be removed in 3.0.0.
setMinInfoGain(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setMinInstancesPerNode(int) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Deprecated.
This method is deprecated and will be removed in 3.0.0.
setMinInstancesPerNode(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setMinSupport(double) - Method in class org.apache.spark.ml.fpm.FPGrowth
setMinSupport(double) - Method in class org.apache.spark.ml.fpm.PrefixSpan
setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.FPGrowth: Sets the minimal support level (default: 0.3).
setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Sets the minimal support level (default: 0.1).
setMinTF(double) - Method in class org.apache.spark.ml.feature.CountVectorizer
setMinTF(double) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
setMinTokenLength(int) - Method in class org.apache.spark.ml.feature.RegexTokenizer
setMissingValue(double) - Method in class org.apache.spark.ml.feature.Imputer
setModelType(String) - Method in class org.apache.spark.ml.classification.NaiveBayes: Set the model type using a string (case-sensitive).
setModelType(String) - Method in class org.apache.spark.mllib.classification.NaiveBayes: Set the model type using a string (case-sensitive).
setN(int) - Method in class org.apache.spark.ml.feature.NGram
setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Assign a name to this RDD.
setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD: Assign a name to this RDD.
setName(String) - Method in class org.apache.spark.api.java.JavaRDD: Assign a name to this RDD.
setName(String) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
setName(String) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
setName(String) - Method in class org.apache.spark.rdd.RDD: Assign a name to this RDD.
setNames(String[]) - Method in class org.apache.spark.ml.feature.VectorSlicer
setNonnegative(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
setNonnegative(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS: Set whether the least-squares problems solved at each iteration should have nonnegativity constraints.
setNullAt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
setNullAt(int) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
setNumBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS: Sets both numUserBlocks and numItemBlocks to the specific value.
setNumBuckets(int) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setNumBucketsArray(int[]) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setNumClasses(int) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS: Set the number of possible outcomes for a k-class classification problem in Multinomial Logistic Regression.
setNumClasses(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the number of corrections used in the LBFGS update.
setNumFeatures(int) - Method in class org.apache.spark.ml.feature.FeatureHasher
setNumFeatures(int) - Method in class org.apache.spark.ml.feature.HashingTF
setNumFolds(int) - Method in class org.apache.spark.ml.tuning.CrossValidator
setNumHashTables(int) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
setNumHashTables(int) - Method in class org.apache.spark.ml.feature.MinHashLSH
setNumItemBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
setNumIterations(int) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Set the number of iterations of gradient descent to run per update.
setNumIterations(int) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets number of iterations (default: 1), which should be smaller than or equal to number of partitions.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the number of iterations for SGD.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the maximal number of iterations for L-BFGS.
setNumIterations(int) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the number of iterations of gradient descent to run per update.
setNumIterations(int) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
setNumPartitions(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setNumPartitions(int) - Method in class org.apache.spark.ml.fpm.FPGrowth
setNumPartitions(int) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets number of partitions (default: 1).
setNumPartitions(int) - Method in class org.apache.spark.mllib.fpm.FPGrowth: Sets the number of partitions used by parallel FP-growth (default: same as input data).
setNumRows(int) - Method in class org.apache.spark.sql.vectorized.ColumnarBatch: Sets the number of rows in this batch.
setNumTopFeatures(int) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setNumTopFeatures(int) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
setNumTrees(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setNumTrees(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setNumTrees(int) - Method in interface org.apache.spark.ml.tree.RandomForestParams: Deprecated.
This method is deprecated and will be removed in 3.0.0.
setNumUserBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
setOffsetCol(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the value of param offsetCol.
setOffsetRange(Optional<Offset>, Optional<Offset>) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader: Set the desired offset range for input partitions created from this reader.
setOptimizeDocConcentration(boolean) - Method in class org.apache.spark.ml.clustering.LDA
setOptimizeDocConcentration(boolean) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Sets whether to optimize docConcentration parameter during training.
setOptimizer(String) - Method in class org.apache.spark.ml.clustering.LDA
setOptimizer(LDAOptimizer) - Method in class org.apache.spark.mllib.clustering.LDA: :: DeveloperApi ::
setOptimizer(String) - Method in class org.apache.spark.mllib.clustering.LDA: Set the LDAOptimizer used to perform the actual calculation by algorithm name.
setOrNull(long, int, int) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given unscaled Long, with a given precision and scale, and return it, or return null if it cannot be set due to overflow.
setOut(PrintStream) - Method in interface org.apache.spark.sql.hive.client.HiveClient
setOutputCol(String) - Method in class org.apache.spark.ml.feature.Binarizer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
setOutputCol(String) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.Bucketizer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setOutputCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.FeatureHasher
setOutputCol(String) - Method in class org.apache.spark.ml.feature.HashingTF
setOutputCol(String) - Method in class org.apache.spark.ml.feature.IDF
setOutputCol(String) - Method in class org.apache.spark.ml.feature.IDFModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.IndexToString
setOutputCol(String) - Method in class org.apache.spark.ml.feature.Interaction
setOutputCol(String) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
setOutputCol(String) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinHashLSH
setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinHashLSHModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScaler
setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
setOutputCol(String) - Method in class org.apache.spark.ml.feature.PCA
setOutputCol(String) - Method in class org.apache.spark.ml.feature.PCAModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StopWordsRemover
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorAssembler
setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorSlicer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.Word2Vec
setOutputCol(String) - Method in class org.apache.spark.ml.feature.Word2VecModel
setOutputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.Bucketizer
setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.Imputer
setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.ImputerModel
setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
setOutputCols(String[]) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setP(double) - Method in class org.apache.spark.ml.feature.Normalizer
setParallelism(int) - Method in class org.apache.spark.ml.classification.OneVsRest: The implementation of parallel one vs. rest runs the classification for each class in a separate thread.
setParallelism(int) - Method in class org.apache.spark.ml.tuning.CrossValidator: Set the maximum level of parallelism to evaluate models in parallel.
setParallelism(int) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit: Set the maximum level of parallelism to evaluate models in parallel.
setParent(Estimator<M>) - Method in class org.apache.spark.ml.Model: Sets the parent of this model (Java API).
setPattern(String) - Method in class org.apache.spark.ml.feature.RegexTokenizer
setPeacePeriod(int) - Method in class org.apache.spark.mllib.stat.test.StreamingTest: Set the number of initial batches to ignore.
setPercentile(double) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setPercentile(double) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
setPredictionCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
setPredictionCol(String) - Method in class org.apache.spark.ml.classification.OneVsRestModel
setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixture
setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.KMeans
setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.KMeansModel
setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
setPredictionCol(String) - Method in class org.apache.spark.ml.fpm.FPGrowth
setPredictionCol(String) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
setPredictionCol(String) - Method in class org.apache.spark.ml.PredictionModel
setPredictionCol(String) - Method in class org.apache.spark.ml.Predictor
setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
setPredictionCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setPredictionCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
setPredictionCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setPredictionCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassifier
setProbabilityCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixture
setProbabilityCol(String) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
setProductBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the number of product blocks to parallelize the computation.
setPropertiesFile(String) - Method in class org.apache.spark.launcher.AbstractLauncher: Set a custom properties file with Spark configuration for the application.
setPropertiesFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
setQuantileCalculationStrategy(Enumeration.Value) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setQuantileProbabilities(double[]) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setQuantileProbabilities(double[]) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
setQuantilesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setQuantilesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
setRandomCenters(int, double, long) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Initialize random centers, requiring only the number of dimensions.
setRank(int) - Method in class org.apache.spark.ml.recommendation.ALS
setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the rank of the feature matrices computed (number of features).
setRatingCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.ClassificationModel
setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.Classifier
setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.OneVsRestModel
setRawPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
setRegParam(double) - Method in class org.apache.spark.ml.classification.LinearSVC: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.ml.recommendation.ALS
setRegParam(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the regularization parameter for L2 regularization.
setRegParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the regularization parameter.
setRelativeError(double) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setRequiredColumns(Configuration, StructType, StructType) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
setRest(long, int, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Deprecated.
This has no effect. Since 2.1.0.
setSample(RDD<Object>) - Method in class org.apache.spark.mllib.stat.KernelDensity: Sets the sample to use for density estimation.
setSample(JavaRDD<Double>) - Method in class org.apache.spark.mllib.stat.KernelDensity: Sets the sample to use for density estimation (for Java users).
setScalingVec(Vector) - Method in class org.apache.spark.ml.feature.ElementwiseProduct
setSeed(long) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setSeed(long) - Method in class org.apache.spark.ml.classification.GBTClassifier
setSeed(long) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Set the seed for weights initialization if weights are not set.
setSeed(long) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setSeed(long) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
setSeed(long) - Method in class org.apache.spark.ml.clustering.GaussianMixture
setSeed(long) - Method in class org.apache.spark.ml.clustering.KMeans
setSeed(long) - Method in class org.apache.spark.ml.clustering.LDA
setSeed(long) - Method in class org.apache.spark.ml.clustering.LDAModel
setSeed(long) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
setSeed(long) - Method in class org.apache.spark.ml.feature.MinHashLSH
setSeed(long) - Method in class org.apache.spark.ml.feature.Word2Vec
setSeed(long) - Method in class org.apache.spark.ml.recommendation.ALS
setSeed(long) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setSeed(long) - Method in class org.apache.spark.ml.regression.GBTRegressor
setSeed(long) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setSeed(long) - Method in interface org.apache.spark.ml.tree.DecisionTreeParams: Deprecated.
This method is deprecated and will be removed in 3.0.0.
setSeed(long) - Method in class org.apache.spark.ml.tuning.CrossValidator
setSeed(long) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
setSeed(long) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Sets the random seed (default: hash value of the class name).
setSeed(long) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Set the random seed.
setSeed(long) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the random seed for cluster initialization.
setSeed(long) - Method in class org.apache.spark.mllib.clustering.LDA: Set the random seed for cluster initialization.
setSeed(long) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Set the random seed for cluster initialization.
setSeed(long) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets random seed (default: a random long integer).
setSeed(long) - Method in class org.apache.spark.mllib.random.ExponentialGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.GammaGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.LogNormalGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.PoissonGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.UniformGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.WeibullGenerator
setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS: Sets a random seed to have deterministic results.
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliCellSampler
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom: Set random seed.
setSelectorType(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setSelectorType(String) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
setSequenceCol(String) - Method in class org.apache.spark.ml.fpm.PrefixSpan
setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD: Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer).
setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD: Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer).
setSize(int) - Method in class org.apache.spark.ml.feature.VectorSizeHint
setSmoothing(double) - Method in class org.apache.spark.ml.classification.NaiveBayes: Set the smoothing parameter.
setSolver(String) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Sets the value of param solver.
setSolver(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the solver algorithm used for optimization.
setSolver(String) - Method in class org.apache.spark.ml.regression.LinearRegression: Set the solver algorithm used for optimization.
setSparkContextSessionConf(SparkSession, Map<Object, Object>) - Static method in class org.apache.spark.sql.api.r.SQLUtils
setSparkHome(String) - Method in class org.apache.spark.launcher.SparkLauncher: Set a custom Spark installation location for the application.
setSparkHome(String) - Method in class org.apache.spark.SparkConf: Set the location where Spark is installed on worker nodes.
setSplits(double[]) - Method in class org.apache.spark.ml.feature.Bucketizer
setSplitsArray(double[][]) - Method in class org.apache.spark.ml.feature.Bucketizer
setSQLReadObject(Function2<DataInputStream, Object, Object>) - Static method in class org.apache.spark.api.r.SerDe
setSQLWriteObject(Function2<DataOutputStream, Object, Object>) - Static method in class org.apache.spark.api.r.SerDe
setSrcCol(String) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
setSrcOnly(long, int, VD) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
setStages(PipelineStage[]) - Method in class org.apache.spark.ml.Pipeline
setStandardization(boolean) - Method in class org.apache.spark.ml.classification.LinearSVC: Whether to standardize the training features before fitting the model.
setStandardization(boolean) - Method in class org.apache.spark.ml.classification.LogisticRegression: Whether to standardize the training features before fitting the model.
setStandardization(boolean) - Method in class org.apache.spark.ml.regression.LinearRegression: Whether to standardize the training features before fitting the model.
setStartOffset(Optional<Offset>) - Method in interface org.apache.spark.sql.sources.v2.reader.streaming.ContinuousReader: Set the desired start offset for partitions created from this reader.
setStatement(String) - Method in class org.apache.spark.ml.feature.SQLTransformer
setStepSize(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
setStepSize(double) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Sets the value of param stepSize (applicable only for solver "gd").
setStepSize(double) - Method in class org.apache.spark.ml.feature.Word2Vec
setStepSize(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
setStepSize(double) - Method in interface org.apache.spark.ml.tree.GBTParams: Deprecated. This method is deprecated and will be removed in 3.0.0.
setStepSize(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Set the step size for gradient descent.
setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the initial step size of SGD for the first step.
setStepSize(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the step size for gradient descent.
setStopWords(String[]) - Method in class org.apache.spark.ml.feature.StopWordsRemover
setStorageLevel(String) - Method in class org.apache.spark.status.LiveRDD
setStrategy(String) - Method in class org.apache.spark.ml.feature.Imputer: Imputation strategy.
setStringIndexerOrderType(String) - Method in class org.apache.spark.ml.feature.RFormula
setStringOrderType(String) - Method in class org.apache.spark.ml.feature.StringIndexer
setSubsamplingRate(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
setSubsamplingRate(double) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setSubsamplingRate(double) - Method in class org.apache.spark.ml.clustering.LDA
setSubsamplingRate(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
setSubsamplingRate(double) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setSubsamplingRate(double) - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams: Deprecated. This method is deprecated and will be removed in 3.0.0.
setSubsamplingRate(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setTau0(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: A (positive) learning parameter that downweights early iterations.
setTestMethod(String) - Method in class org.apache.spark.mllib.stat.test.StreamingTest: Set the statistical method used for significance testing.
setThreshold(double) - Method in class org.apache.spark.ml.classification.LinearSVC: Set threshold in binary classification.
setThreshold(double) - Method in class org.apache.spark.ml.classification.LinearSVCModel
setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
setThreshold(double) - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams: Set threshold in binary classification, in range [0, 1].
setThreshold(double) - Method in class org.apache.spark.ml.feature.Binarizer
setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel: Sets the threshold that separates positive predictions from negative predictions in Binary Logistic Regression.
setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel: Sets the threshold that separates positive predictions from negative predictions.
setThresholds(double[]) - Method in class org.apache.spark.ml.classification.LogisticRegression
setThresholds(double[]) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
setThresholds(double[]) - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams: Set thresholds in multiclass (or binary) classification to adjust the probability of predicting each class.
setThresholds(double[]) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
setThresholds(double[]) - Method in class org.apache.spark.ml.classification.ProbabilisticClassifier
setTimeoutDuration(long) - Method in interface org.apache.spark.sql.streaming.GroupState: Set the timeout duration in ms for this key.
setTimeoutDuration(String) - Method in interface org.apache.spark.sql.streaming.GroupState: Set the timeout duration for this key as a string.
setTimeoutTimestamp(long) - Method in interface org.apache.spark.sql.streaming.GroupState: Set the timeout timestamp for this key as milliseconds in epoch time.
setTimeoutTimestamp(long, String) - Method in interface org.apache.spark.sql.streaming.GroupState: Set the timeout timestamp for this key as milliseconds in epoch time and an additional duration as a string (e.g.
setTimeoutTimestamp(Date) - Method in interface org.apache.spark.sql.streaming.GroupState: Set the timeout timestamp for this key as a java.sql.Date.
setTimeoutTimestamp(Date, String) - Method in interface org.apache.spark.sql.streaming.GroupState: Set the timeout timestamp for this key as a java.sql.Date and an additional duration as a string (e.g.
setTol(double) - Method in class org.apache.spark.ml.classification.LinearSVC: Set the convergence tolerance of iterations.
setTol(double) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the convergence tolerance of iterations.
setTol(double) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Set the convergence tolerance of iterations.
setTol(double) - Method in class org.apache.spark.ml.clustering.GaussianMixture
setTol(double) - Method in class org.apache.spark.ml.clustering.KMeans
setTol(double) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression: Set the convergence tolerance of iterations.
setTol(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the convergence tolerance of iterations.
setTol(double) - Method in class org.apache.spark.ml.regression.LinearRegression: Set the convergence tolerance of iterations.
setToLowercase(boolean) - Method in class org.apache.spark.ml.feature.RegexTokenizer
setTopicConcentration(double) - Method in class org.apache.spark.ml.clustering.LDA
setTopicConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA: Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics' distributions over terms.
setTopicDistributionCol(String) - Method in class org.apache.spark.ml.clustering.LDA
setTopicDistributionCol(String) - Method in class org.apache.spark.ml.clustering.LDAModel
setTrainRatio(double) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
setTreeStrategy(Strategy) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
setUiRoot(ContextHandler, UIRoot) - Static method in class org.apache.spark.status.api.v1.UIRootFromServletContext
setupCommitter(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopMapRedCommitProtocol
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the updater function to actually perform a gradient step in a given direction.
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the updater function to actually perform a gradient step in a given direction.
SetupDriver(org.apache.spark.rpc.RpcEndpointRef) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SetupDriver
SetupDriver$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SetupDriver$
setupGroups(int, DefaultPartitionCoalescer.PartitionLocations) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer: Initializes targetLen partition groups.
setupJob(JobContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol: Sets up a job.
setupJob(JobContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
setUpperBoundsOnCoefficients(Matrix) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the upper bounds on coefficients if fitting under bound constrained optimization.
setUpperBoundsOnIntercepts(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the upper bounds on intercepts if fitting under bound constrained optimization.
setupTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol: Sets up a task within a job.
setupTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
setupUI(org.apache.spark.ui.SparkUI) - Method in interface org.apache.spark.status.AppHistoryServerPlugin: Sets up UI of this plugin to rebuild the history UI.
setUseNodeIdCache(boolean) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setUserBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS: Set the number of user blocks to parallelize the computation.
setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Set if the algorithm should validate data before training.
setValidationIndicatorCol(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
setValidationIndicatorCol(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
setValidationTol(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
setValue(R) - Method in class org.apache.spark.Accumulable: Deprecated. Set the accumulator's value.
setVarianceCol(String) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
setVarianceCol(String) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setVariancePower(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the value of param variancePower.
setVectorSize(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setVectorSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets vector size (default: 100).
setVerbose(boolean) - Method in class org.apache.spark.launcher.AbstractLauncher: Enables verbose reporting for SparkSubmit.
setVerbose(boolean) - Method in class org.apache.spark.launcher.SparkLauncher
setVocabSize(int) - Method in class org.apache.spark.ml.feature.CountVectorizer
setWeightCol(String) - Method in class org.apache.spark.ml.classification.LinearSVC: Set the value of param weightCol.
setWeightCol(double) - Method in class org.apache.spark.ml.classification.LinearSVCModel: Deprecated. This method is deprecated and will be removed in 3.0.0. Since 2.4.4.
setWeightCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegression: Sets the value of param weightCol.
setWeightCol(String) - Method in class org.apache.spark.ml.classification.NaiveBayes: Sets the value of param weightCol.
setWeightCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest: Sets the value of param weightCol.
setWeightCol(String) - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
setWeightCol(String) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression: Sets the value of param weightCol.
setWeightCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setWeightCol(String) - Method in class org.apache.spark.ml.regression.LinearRegression: Whether to over-/under-sample training instances according to the given weights in weightCol.
setWindowSize(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setWindowSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec: Sets the window of words (default: 5).
setWindowSize(int) - Method in class org.apache.spark.mllib.stat.test.StreamingTest: Set the number of batches to compute significance tests over.
setWithMean(boolean) - Method in class org.apache.spark.ml.feature.StandardScaler
setWithMean(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel: :: DeveloperApi ::
setWithStd(boolean) - Method in class org.apache.spark.ml.feature.StandardScaler
setWithStd(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel: :: DeveloperApi ::
sha1(Column) - Static method in class org.apache.spark.sql.functions: Calculates the SHA-1 digest of a binary column and returns the value as a 40 character hex string.
sha2(Column, int) - Static method in class org.apache.spark.sql.functions: Calculates the SHA-2 family of hash functions of a binary column and returns the value as a hex string.
shape() - Method in class org.apache.spark.mllib.random.GammaGenerator
SharedParamsCodeGen - Class in org.apache.spark.ml.param.shared: Code generator for shared params (sharedParams.scala).
SharedParamsCodeGen() - Constructor for class org.apache.spark.ml.param.shared.SharedParamsCodeGen
SharedReadWrite$() - Constructor for class org.apache.spark.ml.Pipeline.SharedReadWrite$
sharedState() - Method in class org.apache.spark.sql.SparkSession: State shared across sessions, including the SparkContext, cached data, listener, and a catalog that interacts with external systems.
shiftLeft(Column, int) - Static method in class org.apache.spark.sql.functions: Shift the given value numBits left.
shiftRight(Column, int) - Static method in class org.apache.spark.sql.functions: (Signed) shift the given value numBits right.
shiftRightUnsigned(Column, int) - Static method in class org.apache.spark.sql.functions: Unsigned shift the given value numBits right.
SHORT() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable short type.
ShortestPaths - Class in org.apache.spark.graphx.lib: Computes shortest paths to the given set of landmark vertices, returning a graph where each vertex attribute is a map containing the shortest-path distance to each reachable landmark.
ShortestPaths() - Constructor for class org.apache.spark.graphx.lib.ShortestPaths
shortName() - Method in interface org.apache.spark.ml.util.MLFormatRegister
shortName() - Method in class org.apache.spark.sql.hive.execution.HiveFileFormat
shortName() - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
shortName() - Method in interface org.apache.spark.sql.sources.DataSourceRegister: The string that represents the format that this data source provider uses.
shortTimeUnitString(TimeUnit) - Static method in class org.apache.spark.streaming.ui.UIUtils: Return the short string for a TimeUnit.
ShortType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the ShortType object.
ShortType - Class in org.apache.spark.sql.types: The data type representing Short values.
ShortType() - Constructor for class org.apache.spark.sql.types.ShortType
shouldCloseFileAfterWrite(SparkConf, boolean) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
shouldDistributeGaussians(int, int) - Static method in class org.apache.spark.mllib.clustering.GaussianMixture: Heuristic to distribute the computation of the MultivariateGaussians, approximately when d is greater than 25 except for when k is very small.
shouldGoLeft(Vector) - Method in interface org.apache.spark.ml.tree.Split: Return true (split to left) or false (split to right).
shouldGoLeft(int, Split[]) - Method in interface org.apache.spark.ml.tree.Split: Return true (split to left) or false (split to right).
shouldOwn(Param<?>) - Method in interface org.apache.spark.ml.param.Params: Validates that the input param belongs to this instance.
shouldRollover(long) - Method in interface org.apache.spark.util.logging.RollingPolicy: Whether rollover should be initiated at this moment.
show(int) - Method in class org.apache.spark.sql.Dataset: Displays the Dataset in a tabular form.
show() - Method in class org.apache.spark.sql.Dataset: Displays the top 20 rows of Dataset in a tabular form.
show(boolean) - Method in class org.apache.spark.sql.Dataset: Displays the top 20 rows of Dataset in a tabular form.
show(int, boolean) - Method in class org.apache.spark.sql.Dataset: Displays the Dataset in a tabular form.
show(int, int) - Method in class org.apache.spark.sql.Dataset: Displays the Dataset in a tabular form.
show(int, int, boolean) - Method in class org.apache.spark.sql.Dataset: Displays the Dataset in a tabular form.
showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Object>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showBytesDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showBytesDistribution(String, org.apache.spark.util.Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDagVizForJob(int, Seq<org.apache.spark.ui.scope.RDDOperationGraph>) - Static method in class org.apache.spark.ui.UIUtils: Return a "DAG visualization" DOM element that expands into a visualization for a job.
showDagVizForStage(int, Option<org.apache.spark.ui.scope.RDDOperationGraph>) - Static method in class org.apache.spark.ui.UIUtils: Return a "DAG visualization" DOM element that expands into a visualization for a stage.
showDistribution(String, org.apache.spark.util.Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, Option<org.apache.spark.util.Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, Option<org.apache.spark.util.Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Object>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Object>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
shuffle(Column) - Static method in class org.apache.spark.sql.functions: Returns a random permutation of the given array.
SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
SHUFFLE_DATA() - Static method in class org.apache.spark.storage.BlockId
SHUFFLE_INDEX() - Static method in class org.apache.spark.storage.BlockId
SHUFFLE_LOCAL_BLOCKS() - Static method in class org.apache.spark.status.TaskIndexNames
SHUFFLE_READ() - Static method in class org.apache.spark.ui.ToolTips
SHUFFLE_READ_BLOCKED_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
SHUFFLE_READ_BLOCKED_TIME() - Static method in class org.apache.spark.ui.ToolTips
SHUFFLE_READ_METRICS_PREFIX() - Static method in class org.apache.spark.InternalAccumulator
SHUFFLE_READ_RECORDS() - Static method in class org.apache.spark.status.TaskIndexNames
SHUFFLE_READ_REMOTE_SIZE() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
SHUFFLE_READ_REMOTE_SIZE() - Static method in class org.apache.spark.ui.ToolTips
SHUFFLE_READ_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
SHUFFLE_REMOTE_BLOCKS() - Static method in class org.apache.spark.status.TaskIndexNames
SHUFFLE_REMOTE_READS() - Static method in class org.apache.spark.status.TaskIndexNames
SHUFFLE_REMOTE_READS_TO_DISK() - Static method in class org.apache.spark.status.TaskIndexNames
SHUFFLE_TOTAL_BLOCKS() - Static method in class org.apache.spark.status.TaskIndexNames
SHUFFLE_TOTAL_READS() - Static method in class org.apache.spark.status.TaskIndexNames
SHUFFLE_WRITE() - Static method in class org.apache.spark.ui.ToolTips
SHUFFLE_WRITE_METRICS_PREFIX() - Static method in class org.apache.spark.InternalAccumulator
SHUFFLE_WRITE_RECORDS() - Static method in class org.apache.spark.status.TaskIndexNames
SHUFFLE_WRITE_SIZE() - Static method in class org.apache.spark.status.TaskIndexNames
SHUFFLE_WRITE_TIME() - Static method in class org.apache.spark.status.TaskIndexNames
ShuffleBlockId - Class in org.apache.spark.storage
ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
shuffleCleaned(int) - Method in interface org.apache.spark.CleanerListener
ShuffleDataBlockId - Class in org.apache.spark.storage
ShuffleDataBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleDataBlockId
ShuffleDependency<K,V,C> - Class in org.apache.spark: :: DeveloperApi :: Represents a dependency on the output of a shuffle stage.
ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Serializer, Option<Ordering<K>>, Option<Aggregator<K, V, C>>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.ShuffleDependency
ShuffledRDD<K,V,C> - Class in org.apache.spark.rdd: :: DeveloperApi :: The resulting RDD from a shuffle (e.g.
ShuffledRDD(RDD<? extends Product2<K, V>>, Partitioner, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.rdd.ShuffledRDD
shuffleHandle() - Method in class org.apache.spark.ShuffleDependency
shuffleId() - Method in class org.apache.spark.CleanShuffle
shuffleId() - Method in class org.apache.spark.FetchFailed
shuffleId() - Method in class org.apache.spark.ShuffleDependency
shuffleId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveShuffle
shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
shuffleId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
shuffleId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
ShuffleIndexBlockId - Class in org.apache.spark.storage
ShuffleIndexBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleIndexBlockId
shuffleManager() - Method in class org.apache.spark.SparkEnv
shuffleRead() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
shuffleRead$() - Constructor for class org.apache.spark.InternalAccumulator.shuffleRead$
shuffleReadBytes() - Method in class org.apache.spark.status.api.v1.StageData
ShuffleReadMetricDistributions - Class in org.apache.spark.status.api.v1
ShuffleReadMetrics - Class in org.apache.spark.status.api.v1
shuffleReadMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
shuffleReadMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
shuffleReadRecords() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
shuffleReadRecords() - Method in class org.apache.spark.status.api.v1.StageData
ShuffleStatus - Class in org.apache.spark: Helper class used by the MapOutputTrackerMaster to perform bookkeeping for a single ShuffleMapStage.
ShuffleStatus(int) - Constructor for class org.apache.spark.ShuffleStatus
shuffleWrite() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
shuffleWrite$() - Constructor for class org.apache.spark.InternalAccumulator.shuffleWrite$
shuffleWriteBytes() - Method in class org.apache.spark.status.api.v1.StageData
ShuffleWriteMetricDistributions - Class in org.apache.spark.status.api.v1
ShuffleWriteMetrics - Class in org.apache.spark.status.api.v1
shuffleWriteMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
shuffleWriteMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
shuffleWriteRecords() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
shuffleWriteRecords() - Method in class org.apache.spark.status.api.v1.StageData
shutdown() - Method in interface org.apache.spark.ExecutorPlugin: Clean up and terminate this plugin.
shutdown(ExecutorService, Duration) - Static method in class org.apache.spark.util.ThreadUtils
Shutdown$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.Shutdown$
ShutdownHookManager - Class in org.apache.spark.util: Various utility methods used by Spark.
ShutdownHookManager() - Constructor for class org.apache.spark.util.ShutdownHookManager
sigma() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
sigmas() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
SignalUtils - Class in org.apache.spark.util: Contains utilities for working with POSIX signals.
SignalUtils() - Constructor for class org.apache.spark.util.SignalUtils
signum(Column) - Static method in class org.apache.spark.sql.functions: Computes the signum of the given value.
signum(String) - Static method in class org.apache.spark.sql.functions: Computes the signum of the given column.
SimpleFutureAction<T> - Class in org.apache.spark: A FutureAction holding the result of an action that triggers a single job.
simpleString() - Method in class org.apache.spark.sql.types.ArrayType
simpleString() - Static method in class org.apache.spark.sql.types.BinaryType
simpleString() - Static method in class org.apache.spark.sql.types.BooleanType
simpleString() - Method in class org.apache.spark.sql.types.ByteType
simpleString() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
simpleString() - Method in class org.apache.spark.sql.types.CharType
simpleString() - Method in class org.apache.spark.sql.types.DataType: Readable string representation for the type.
simpleString() - Static method in class org.apache.spark.sql.types.DateType
simpleString() - Method in class org.apache.spark.sql.types.DecimalType
simpleString() - Static method in class org.apache.spark.sql.types.DoubleType
simpleString() - Static method in class org.apache.spark.sql.types.FloatType
simpleString() - Method in class org.apache.spark.sql.types.IntegerType
simpleString() - Method in class org.apache.spark.sql.types.LongType
simpleString() - Method in class org.apache.spark.sql.types.MapType
simpleString() - Static method in class org.apache.spark.sql.types.NullType
simpleString() - Method in class org.apache.spark.sql.types.ObjectType
simpleString() - Method in class org.apache.spark.sql.types.ShortType
simpleString() - Static method in class org.apache.spark.sql.types.StringType
simpleString() - Method in class org.apache.spark.sql.types.StructType
simpleString() - Static method in class org.apache.spark.sql.types.TimestampType
simpleString() - Method in class org.apache.spark.sql.types.VarcharType
SimpleUpdater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: A simple updater for gradient descent *without* any regularization.
SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
sin(Column) - Static method in class org.apache.spark.sql.functions
sin(String) - Static method in class org.apache.spark.sql.functions
SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg: Represents singular value decomposition (SVD) factors.
SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
sinh(Column) - Static method in class org.apache.spark.sql.functions
sinh(String) - Static method in class org.apache.spark.sql.functions
Sink - Interface in org.apache.spark.metrics.sink
sink() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
SinkProgress - Class in org.apache.spark.sql.streaming: Information about progress made for a sink in the execution of a StreamingQuery during a trigger.
size() - Method in class org.apache.spark.api.java.JavaUtils.SerializableMapWrapper
size() - Method in class org.apache.spark.ml.attribute.AttributeGroup: Size of the attribute group.
size() - Method in class org.apache.spark.ml.feature.VectorSizeHint: The size of Vectors in inputCol.
size() - Method in class org.apache.spark.ml.linalg.DenseVector
size() - Method in class org.apache.spark.ml.linalg.SparseVector
size() - Method in interface org.apache.spark.ml.linalg.Vector: Size of the vector.
size() - Method in class org.apache.spark.ml.param.ParamMap: Number of param pairs in this map.
size() - Method in class org.apache.spark.mllib.linalg.DenseVector
size() - Method in class org.apache.spark.mllib.linalg.SparseVector
size() - Method in interface org.apache.spark.mllib.linalg.Vector: Size of the vector.
size(Column) - Static method in class org.apache.spark.sql.functions: Returns length of array or map.
size() - Method in interface org.apache.spark.sql.Row: Number of elements in the Row.
size() - Method in interface org.apache.spark.storage.BlockData
size() - Method in class org.apache.spark.storage.DiskBlockData
size() - Method in class org.apache.spark.storage.memory.DeserializedMemoryEntry
size() - Method in interface org.apache.spark.storage.memory.MemoryEntry
size() - Method in class org.apache.spark.storage.memory.SerializedMemoryEntry
SizeEstimator - Class in org.apache.spark.util: :: DeveloperApi :: Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in memory-aware caches.
SizeEstimator() - Constructor for class org.apache.spark.util.SizeEstimator
sizeInBytes() - Method in class org.apache.spark.sql.sources.BaseRelation: Returns an estimated size of this relation in bytes.
sizeInBytes() - Method in interface org.apache.spark.sql.sources.v2.reader.Statistics
sketch(RDD<K>, int, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner: Sketches the input RDD via reservoir sampling on each partition.
skewness(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the skewness of the values in a group.
skewness(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the skewness of the values in a group.
skip(long) - Method in class org.apache.spark.io.NioBufferedFileInputStream
skip(long) - Method in class org.apache.spark.io.ReadAheadInputStream
skip(long) - Method in class org.apache.spark.storage.BufferReleasingInputStream
skippedStages() - Method in class org.apache.spark.status.LiveJob
skippedTasks() - Method in class org.apache.spark.status.LiveJob
skipWhitespace() - Static method in class org.apache.spark.ml.feature.RFormulaParser
slice(Column, int, int) - Static method in class org.apache.spark.sql.functions: Returns an array containing all the elements in x from index start (or starting from the end if start is negative) with the specified length.
slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return all the RDDs between 'fromDuration' and 'toDuration' (both included)
slice(org.apache.spark.streaming.Interval) - Method in class org.apache.spark.streaming.dstream.DStream: Return all the RDDs defined by the Interval object (both end times included)
slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream: Return all the RDDs between 'fromTime' and 'toTime' (both included)
slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream: Time interval after which the DStream generates an RDD
slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
sliding(int, int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions: Returns an RDD from grouping items of its parent RDD in fixed size blocks by passing a sliding window over them.
sliding(int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions: Equivalent to sliding(int, int) with step = 1.
smoothing() - Method in interface org.apache.spark.ml.classification.NaiveBayesParams: The smoothing parameter.
SnappyCompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: Snappy implementation of CompressionCodec.
SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
SnappyOutputStreamWrapper - Class in org.apache.spark.io: Wrapper over SnappyOutputStream which guards against write-after-close and double-close issues.
SnappyOutputStreamWrapper(SnappyOutputStream) - Constructor for class org.apache.spark.io.SnappyOutputStreamWrapper
socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Creates an input stream from TCP source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext: Creates an input stream from TCP source hostname:port.
solve(double, double, DenseVector, DenseVector, DenseVector) - Method in interface org.apache.spark.ml.optim.NormalEquationSolver: Solve the normal equations from summary statistics.
solve(ALS.NormalEquation, double) - Method in interface org.apache.spark.ml.recommendation.ALS.LeastSquaresNESolver: Solves a least squares problem with regularization (possibly with other constraints).
solve(double[], double[]) - Static method in class org.apache.spark.mllib.linalg.CholeskyDecomposition: Solves a symmetric positive definite linear system via Cholesky factorization.
solve(double[], double[], NNLS.Workspace) - Static method in class org.apache.spark.mllib.optimization.NNLS: Solve a least squares problem, possibly with nonnegativity constraints, by a modified projected gradient method.
solver() - Method in interface org.apache.spark.ml.classification.MultilayerPerceptronParams: The solver algorithm for optimization.
solver() - Method in interface org.apache.spark.ml.param.shared.HasSolver: Param for the solver algorithm for optimization.
solver() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase: The solver algorithm for optimization.
solver() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary
solver() - Method in interface org.apache.spark.ml.regression.LinearRegressionParams: The solver algorithm for optimization.
Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
sort(String, String...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset sorted by the specified column, all in ascending order.
sort(Column...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset sorted by the given expressions.
sort(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset sorted by the specified column, all in ascending order.
sort(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset sorted by the given expressions.
sort_array(Column) - Static method in class org.apache.spark.sql.functions: Sorts the input array for the given column in ascending order, according to the natural ordering of the array elements.
sort_array(Column, boolean) - Static method in class org.apache.spark.sql.functions: Sorts the input array for the given column in ascending or descending order, according to the natural ordering of the array elements.
sortBy(Function<T, S>, boolean, int) - Method in class org.apache.spark.api.java.JavaRDD: Return this RDD sorted by the given key function.
sortBy(Function1<T, K>, boolean, int, Ordering<K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return this RDD sorted by the given key function.
sortBy(String, String...) - Method in class org.apache.spark.sql.DataFrameWriter: Sorts the output in each bucket by the given columns.
sortBy(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameWriter: Sorts the output in each bucket by the given columns.
sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements in ascending order.
sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortWithinPartitions(String, String...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with each partition sorted by the given expressions.
sortWithinPartitions(Column...) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with each partition sorted by the given expressions.
sortWithinPartitions(String, Seq<String>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with each partition sorted by the given expressions.
sortWithinPartitions(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with each partition sorted by the given expressions.
soundex(Column) - Static method in class org.apache.spark.sql.functions: Returns the soundex code for the specified expression.
Source - Interface in org.apache.spark.metrics.source
sourceName() - Static method in class org.apache.spark.metrics.source.CodegenMetrics
sourceName() - Static method in class org.apache.spark.metrics.source.HiveCatalogMetrics
sourceName() - Method in interface org.apache.spark.metrics.source.Source
SourceProgress - Class in org.apache.spark.sql.streaming: Information about progress made for a source in the execution of a StreamingQuery during a trigger.
sources() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
sourceSchema(SQLContext, Option<StructType>, String, Map<String, String>) - Method in interface org.apache.spark.sql.sources.StreamSourceProvider: Returns the name and schema of the source that can be used to continually read data.
spark() - Method in class org.apache.spark.status.api.v1.VersionInfo
SPARK_CONNECTOR_NAME() - Static method in class org.apache.spark.ui.JettyUtils
SPARK_CONTEXT_SHUTDOWN_PRIORITY() - Static method in class org.apache.spark.util.ShutdownHookManager: The shutdown priority of the SparkContext instance.
SPARK_IO_ENCRYPTION_COMMONS_CONFIG_PREFIX() - Static method in class org.apache.spark.security.CryptoStreamUtils
SPARK_MASTER - Static variable in class org.apache.spark.launcher.SparkLauncher: The Spark master.
spark_partition_id() - Static method in class org.apache.spark.sql.functions: Partition ID.
SPARK_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
SparkAppConfig(Seq<Tuple2<String, String>>, Option<byte[]>, Option<byte[]>) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig
SparkAppConfig$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig$
SparkAppHandle - Interface in org.apache.spark.launcher: A handle to a running Spark application.
SparkAppHandle.Listener - Interface in org.apache.spark.launcher: Listener for updates to a handle's state.
SparkAppHandle.State - Enum in org.apache.spark.launcher: Represents the application's state.
SparkAWSCredentials - Interface in org.apache.spark.streaming.kinesis: Serializable interface providing a method executors can call to obtain an AWSCredentialsProvider instance for authenticating to AWS services.
SparkAWSCredentials.Builder - Class in org.apache.spark.streaming.kinesis: Builder for SparkAWSCredentials instances.
SparkConf - Class in org.apache.spark: Configuration for a Spark application.
SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
SparkConf() - Constructor for class org.apache.spark.SparkConf: Create a SparkConf that loads defaults from system properties and the classpath
sparkContext() - Method in class org.apache.spark.rdd.RDD: The SparkContext that created this RDD.
SparkContext - Class in org.apache.spark: Main entry point for Spark functionality.
SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
SparkContext() - Constructor for class org.apache.spark.SparkContext: Create a SparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext: Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.SparkContext: Alternative constructor that allows setting common Spark properties directly
sparkContext() - Method in class org.apache.spark.sql.SparkSession
sparkContext() - Method in class org.apache.spark.sql.SQLContext
sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: The underlying SparkContext
sparkContext() - Method in class org.apache.spark.streaming.StreamingContext: Return the associated Spark context
SparkEnv - Class in org.apache.spark: :: DeveloperApi :: Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, RpcEnv, block manager, map output tracker, etc.
SparkEnv(String, org.apache.spark.rpc.RpcEnv, Serializer, Serializer, org.apache.spark.serializer.SerializerManager, MapOutputTracker, ShuffleManager, org.apache.spark.broadcast.BroadcastManager, org.apache.spark.storage.BlockManager, SecurityManager, org.apache.spark.metrics.MetricsSystem, MemoryManager, org.apache.spark.scheduler.OutputCommitCoordinator, SparkConf) - Constructor for class org.apache.spark.SparkEnv
sparkEventFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
sparkEventToJson(SparkListenerEvent) - Static method in class org.apache.spark.util.JsonProtocol: JSON serialization methods for SparkListenerEvents.
SparkException - Exception in org.apache.spark
SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
SparkException(String) - Constructor for exception org.apache.spark.SparkException
SparkExecutorInfo - Interface in org.apache.spark: Exposes information about Spark Executors.
SparkExecutorInfoImpl - Class in org.apache.spark
SparkExecutorInfoImpl(String, int, long, int, long, long, long, long) - Constructor for class org.apache.spark.SparkExecutorInfoImpl
SparkExitCode - Class in org.apache.spark.util
SparkExitCode() - Constructor for class org.apache.spark.util.SparkExitCode
SparkFiles - Class in org.apache.spark: Resolves paths to files added through SparkContext.addFile().
SparkFiles() - Constructor for class org.apache.spark.SparkFiles
SparkFirehoseListener - Class in org.apache.spark: Class that allows users to receive all SparkListener events.
SparkFirehoseListener() - Constructor for class org.apache.spark.SparkFirehoseListener
SparkHadoopMapRedUtil - Class in org.apache.spark.mapred
SparkHadoopMapRedUtil() - Constructor for class org.apache.spark.mapred.SparkHadoopMapRedUtil
SparkHadoopWriter - Class in org.apache.spark.internal.io: A helper object that saves an RDD using a Hadoop OutputFormat.
SparkHadoopWriter() - Constructor for class org.apache.spark.internal.io.SparkHadoopWriter
SparkHadoopWriterUtils - Class in org.apache.spark.internal.io: A helper object that provides common utilities used when saving an RDD using a Hadoop OutputFormat (both from the old mapred API and the new mapreduce API).
SparkHadoopWriterUtils() - Constructor for class org.apache.spark.internal.io.SparkHadoopWriterUtils
sparkJavaOpts(SparkConf, Function1<String, Object>) - Static method in class org.apache.spark.util.Utils: Convert all Spark properties set in the given SparkConf to a sequence of Java options.
SparkJobInfo - Interface in org.apache.spark: Exposes information about Spark Jobs.
SparkJobInfoImpl - Class in org.apache.spark
SparkJobInfoImpl(int, int[], JobExecutionStatus) - Constructor for class org.apache.spark.SparkJobInfoImpl
SparkLauncher - Class in org.apache.spark.launcher: Launcher for Spark applications.
SparkLauncher() - Constructor for class org.apache.spark.launcher.SparkLauncher
SparkLauncher(Map<String, String>) - Constructor for class org.apache.spark.launcher.SparkLauncher: Creates a launcher that will set the given environment variables in the child.
SparkListener - Class in org.apache.spark.scheduler: :: DeveloperApi :: A default implementation for SparkListenerInterface that has no-op implementations for all callbacks.
SparkListener() - Constructor for class org.apache.spark.scheduler.SparkListener
SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
SparkListenerApplicationStart - Class in org.apache.spark.scheduler
SparkListenerApplicationStart(String, Option<String>, long, String, Option<String>, Option<Map<String, String>>) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
SparkListenerBlockManagerAdded(long, BlockManagerId, long, Option<Object>, Option<Object>) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
SparkListenerBlockManagerRemoved(long, BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
SparkListenerBlockUpdated - Class in org.apache.spark.scheduler
SparkListenerBlockUpdated(BlockUpdatedInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockUpdated
SparkListenerBus - Interface in org.apache.spark.scheduler: A SparkListenerEvent bus that relays SparkListenerEvents to its listeners
SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
SparkListenerEvent - Interface in org.apache.spark.scheduler
SparkListenerExecutorAdded - Class in org.apache.spark.scheduler
SparkListenerExecutorAdded(long, String, ExecutorInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorAdded
SparkListenerExecutorBlacklisted - Class in org.apache.spark.scheduler
SparkListenerExecutorBlacklisted(long, String, int) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
SparkListenerExecutorBlacklistedForStage - Class in org.apache.spark.scheduler
SparkListenerExecutorBlacklistedForStage(long, String, int, int, int) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
SparkListenerExecutorMetricsUpdate - Class in org.apache.spark.scheduler: Periodic updates from executors.
SparkListenerExecutorMetricsUpdate(String, Seq<Tuple4<Object, Object, Object, Seq<AccumulableInfo>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
SparkListenerExecutorRemoved - Class in org.apache.spark.scheduler
SparkListenerExecutorRemoved(long, String, String) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorRemoved
SparkListenerExecutorUnblacklisted - Class in org.apache.spark.scheduler
SparkListenerExecutorUnblacklisted(long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorUnblacklisted
SparkListenerInterface - Interface in org.apache.spark.scheduler: Interface for listening to events from the Spark scheduler.
SparkListenerJobEnd - Class in org.apache.spark.scheduler
SparkListenerJobEnd(int, long, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
SparkListenerJobStart - Class in org.apache.spark.scheduler
SparkListenerJobStart(int, long, Seq<StageInfo>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
SparkListenerLogStart - Class in org.apache.spark.scheduler: An internal class that describes the metadata of an event log.
SparkListenerLogStart(String) - Constructor for class org.apache.spark.scheduler.SparkListenerLogStart
SparkListenerNodeBlacklisted - Class in org.apache.spark.scheduler
SparkListenerNodeBlacklisted(long, String, int) - Constructor for class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
SparkListenerNodeBlacklistedForStage - Class in org.apache.spark.scheduler
SparkListenerNodeBlacklistedForStage(long, String, int, int, int) - Constructor for class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
SparkListenerNodeUnblacklisted - Class in org.apache.spark.scheduler
SparkListenerNodeUnblacklisted(long, String) - Constructor for class org.apache.spark.scheduler.SparkListenerNodeUnblacklisted
SparkListenerSpeculativeTaskSubmitted - Class in org.apache.spark.scheduler
SparkListenerSpeculativeTaskSubmitted(int) - Constructor for class org.apache.spark.scheduler.SparkListenerSpeculativeTaskSubmitted
SparkListenerStageCompleted - Class in org.apache.spark.scheduler
SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
SparkListenerTaskEnd - Class in org.apache.spark.scheduler
SparkListenerTaskEnd(int, int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
SparkListenerTaskStart - Class in org.apache.spark.scheduler
SparkListenerTaskStart(int, int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
SparkMasterRegex - Class in org.apache.spark: A collection of regexes for extracting information from the master string.
SparkMasterRegex() - Constructor for class org.apache.spark.SparkMasterRegex
sparkProperties() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.SparkAppConfig
sparkProperties() - Method in class org.apache.spark.status.api.v1.ApplicationEnvironmentInfo
SparkRDefaults - Class in org.apache.spark.api.r
SparkRDefaults() - Constructor for class org.apache.spark.api.r.SparkRDefaults
sparkRPackagePath(boolean) - Static method in class org.apache.spark.api.r.RUtils: Get the list of paths for R packages in various deployment modes, of which the first path is for the SparkR package itself.
sparkSession() - Method in interface org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
sparkSession() - Method in interface org.apache.spark.ml.util.BaseReadWrite: Returns the user-specified Spark Session or the default.
sparkSession() - Method in class org.apache.spark.sql.Dataset
sparkSession() - Method in interface org.apache.spark.sql.hive.HiveStrategies
SparkSession - Class in org.apache.spark.sql: The entry point to programming Spark with the Dataset and DataFrame API.
sparkSession() - Method in class org.apache.spark.sql.SQLContext
sparkSession() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Returns the SparkSession associated with this query.
SparkSession.Builder - Class in org.apache.spark.sql: Builder for SparkSession.
SparkSession.implicits$ - Class in org.apache.spark.sql: :: Experimental :: (Scala-specific) Implicit methods available in Scala for converting common Scala objects into DataFrames.
SparkSessionExtensions - Class in org.apache.spark.sql: :: Experimental :: Holder for injection points to the SparkSession.
SparkSessionExtensions() - Constructor for class org.apache.spark.sql.SparkSessionExtensions
SparkShutdownHook - Class in org.apache.spark.util
SparkShutdownHook(int, Function0<BoxedUnit>) - Constructor for class org.apache.spark.util.SparkShutdownHook
SparkStageInfo - Interface in org.apache.spark: Exposes information about Spark Stages.
SparkStageInfoImpl - Class in org.apache.spark
SparkStageInfoImpl(int, int, long, String, int, int, int, int) - Constructor for class org.apache.spark.SparkStageInfoImpl
SparkStatusTracker - Class in org.apache.spark: Low-level status reporting APIs for monitoring job and stage progress.
sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
sparkUser() - Method in class org.apache.spark.SparkContext
sparkUser() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
SparkUserDefinedFunction - Class in org.apache.spark.sql.expressions
SparkUserDefinedFunction() - Constructor for class org.apache.spark.sql.expressions.SparkUserDefinedFunction
sparkVersion() - Method in class org.apache.spark.scheduler.SparkListenerLogStart
sparse(int, int, int[], int[], double[]) - Static method in class org.apache.spark.ml.linalg.Matrices: Creates a column-major sparse matrix in Compressed Sparse Column (CSC) format.
sparse(int, int[], double[]) - Static method in class org.apache.spark.ml.linalg.Vectors: Creates a sparse vector providing its index array and value array.
sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.ml.linalg.Vectors: Creates a sparse vector using unordered (index, value) pairs.
sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.ml.linalg.Vectors: Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
sparse(int, int, int[], int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices: Creates a column-major sparse matrix in Compressed Sparse Column (CSC) format.
sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector providing its index array and value array.
sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector using unordered (index, value) pairs.
sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
SparseMatrix - Class in org.apache.spark.ml.linalg: Column-major sparse matrix.
SparseMatrix(int, int, int[], int[], double[], boolean) - Constructor for class org.apache.spark.ml.linalg.SparseMatrix
SparseMatrix(int, int, int[], int[], double[]) - Constructor for class org.apache.spark.ml.linalg.SparseMatrix: Column-major sparse matrix.
SparseMatrix - Class in org.apache.spark.mllib.linalg: Column-major sparse matrix.
SparseMatrix(int, int, int[], int[], double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
SparseMatrix(int, int, int[], int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix: Column-major sparse matrix.
SparseVector - Class in org.apache.spark.ml.linalg: A sparse vector represented by an index array and a value array.
SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.ml.linalg.SparseVector
SparseVector - Class in org.apache.spark.mllib.linalg: A sparse vector represented by an index array and a value array.
SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
SPARSITY() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
sparsity() - Method in class org.apache.spark.ml.attribute.NumericAttribute
spdiag(Vector) - Static method in class org.apache.spark.ml.linalg.SparseMatrix: Generate a diagonal matrix in SparseMatrix format from the supplied values.
spdiag(Vector) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate a diagonal matrix in SparseMatrix format from the supplied values.
SpearmanCorrelation - Class in org.apache.spark.mllib.stat.correlation: Compute Spearman's correlation for two RDDs of the type RDD[Double] or the correlation matrix for an RDD of the type RDD[Vector].
SpearmanCorrelation() - Constructor for class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
SpecialLengths - Class in org.apache.spark.api.r
SpecialLengths() - Constructor for class org.apache.spark.api.r.SpecialLengths
speculative() - Method in class org.apache.spark.scheduler.TaskInfo
speculative() - Method in class org.apache.spark.status.api.v1.TaskData
speye(int) - Static method in class org.apache.spark.ml.linalg.Matrices: Generate a sparse Identity Matrix in Matrix format.
speye(int) - Static method in class org.apache.spark.ml.linalg.SparseMatrix: Generate an Identity Matrix in SparseMatrix format.
speye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a sparse Identity Matrix in Matrix format.
speye(int) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate an Identity Matrix in SparseMatrix format.
SpillListener - Class in org.apache.spark: A SparkListener that detects whether spills have occurred in Spark jobs.
SpillListener() - Constructor for class org.apache.spark.SpillListener
split() - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData
split() - Method in class org.apache.spark.ml.tree.InternalNode
Split - Interface in org.apache.spark.ml.tree: Interface for a "Split," which specifies a test made at a decision tree node to choose the left or right path.
split() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
split() - Method in class org.apache.spark.mllib.tree.model.Node
Split - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Split applied to a feature. Parameters: feature (feature index), threshold (threshold for continuous feature).
Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
split(Column, String) - Static method in class org.apache.spark.sql.functions: Splits str around pattern (pattern is a regular expression).
splitAndCountPartitions(Iterator<String>) - Static method in class org.apache.spark.streaming.util.RawTextHelper: Splits lines and counts the words.
splitCommandString(String) - Static method in class org.apache.spark.util.Utils: Split a string of potentially quoted arguments from the command line the way that a shell would do it to determine arguments to a command.
SplitData(int, double[], int) - Constructor for class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData
SplitData(int, double, int, Seq<Object>) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
SplitData$() - Constructor for class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData$
SplitData$() - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData$
splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
SplitInfo - Class in org.apache.spark.scheduler
SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
splits() - Method in class org.apache.spark.ml.feature.Bucketizer: Parameter for mapping continuous features into buckets.
splitsArray() - Method in class org.apache.spark.ml.feature.Bucketizer: Parameter for specifying multiple splits parameters.
spr(double, Vector, DenseVector) - Static method in class org.apache.spark.ml.linalg.BLAS: Adds alpha * x * x.t to a matrix in-place.
spr(double, Vector, double[]) - Static method in class org.apache.spark.ml.linalg.BLAS: Adds alpha * x * x.t to a matrix in-place.
spr(double, Vector, DenseVector) - Static method in class org.apache.spark.mllib.linalg.BLAS: Adds alpha * v * v.t to a matrix in-place.
spr(double, Vector, double[]) - Static method in class org.apache.spark.mllib.linalg.BLAS: Adds alpha * v * v.t to a matrix in-place.
sprand(int, int, double, Random) - Static method in class org.apache.spark.ml.linalg.Matrices: Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
sprand(int, int, double, Random) - Static method in class org.apache.spark.ml.linalg.SparseMatrix: Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
sprandn(int, int, double, Random) - Static method in class org.apache.spark.ml.linalg.Matrices: Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
sprandn(int, int, double, Random) - Static method in class org.apache.spark.ml.linalg.SparseMatrix: Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
sqdist(Vector, Vector) - Static method in class org.apache.spark.ml.linalg.Vectors: Returns the squared distance between two Vectors.
sqdist(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.Vectors: Returns the squared distance between two Vectors.
sql(String) - Method in class org.apache.spark.sql.SparkSession: Executes a SQL query using Spark, returning the result as a DataFrame.
sql(String) - Method in class org.apache.spark.sql.SQLContext
sql() - Method in class org.apache.spark.sql.types.ArrayType
sql() - Static method in class org.apache.spark.sql.types.BinaryType
sql() - Static method in class org.apache.spark.sql.types.BooleanType
sql() - Static method in class org.apache.spark.sql.types.ByteType
sql() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
sql() - Method in class org.apache.spark.sql.types.DataType
sql() - Static method in class org.apache.spark.sql.types.DateType
sql() - Method in class org.apache.spark.sql.types.DecimalType
sql() - Static method in class org.apache.spark.sql.types.DoubleType
sql() - Static method in class org.apache.spark.sql.types.FloatType
sql() - Static method in class org.apache.spark.sql.types.IntegerType
sql() - Static method in class org.apache.spark.sql.types.LongType
sql() - Method in class org.apache.spark.sql.types.MapType
sql() - Static method in class org.apache.spark.sql.types.NullType
sql() - Static method in class org.apache.spark.sql.types.ShortType
sql() - Static method in class org.apache.spark.sql.types.StringType
sql() - Method in class org.apache.spark.sql.types.StructType
sql() - Static method in class org.apache.spark.sql.types.TimestampType
sqlContext() - Method in interface org.apache.spark.ml.util.BaseReadWrite: Returns the user-specified SQL context or the default.
sqlContext() - Method in class org.apache.spark.sql.Dataset
sqlContext() - Method in class org.apache.spark.sql.sources.BaseRelation
sqlContext() - Method in class org.apache.spark.sql.SparkSession: A wrapped version of this session in the form of a SQLContext, for backward compatibility.
SQLContext - Class in org.apache.spark.sql: The entry point for working with structured data (rows and columns) in Spark 1.x.
SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext: Deprecated. Use SparkSession.builder instead. Since 2.0.0.
SQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.SQLContext: Deprecated. Use SparkSession.builder instead. Since 2.0.0.
SQLContext.implicits$ - Class in org.apache.spark.sql: :: Experimental :: (Scala-specific) Implicit methods available in Scala for converting common Scala objects into DataFrames.
SQLDataTypes - Class in org.apache.spark.ml.linalg: :: DeveloperApi :: SQL data types for vectors and matrices.
SQLDataTypes() - Constructor for class org.apache.spark.ml.linalg.SQLDataTypes
SQLImplicits - Class in org.apache.spark.sql: A collection of implicit methods for converting common Scala objects into Datasets.
SQLImplicits() - Constructor for class org.apache.spark.sql.SQLImplicits
SQLImplicits.StringToColumn - Class in org.apache.spark.sql: Converts $"col name" into a Column.
SQLTransformer - Class in org.apache.spark.ml.feature: Implements the transformations which are defined by SQL statement.
SQLTransformer(String) - Constructor for class org.apache.spark.ml.feature.SQLTransformer
SQLTransformer() - Constructor for class org.apache.spark.ml.feature.SQLTransformer
sqlType() - Method in class org.apache.spark.mllib.linalg.VectorUDT
SQLUserDefinedType - Annotation Type in org.apache.spark.sql.types: :: DeveloperApi :: A user-defined type which can be automatically recognized by a SQLContext and registered.
SQLUtils - Class in org.apache.spark.sql.api.r
SQLUtils() - Constructor for class org.apache.spark.sql.api.r.SQLUtils
sqrt(Column) - Static method in class org.apache.spark.sql.functions: Computes the square root of the specified float value.
sqrt(String) - Static method in class org.apache.spark.sql.functions: Computes the square root of the specified float value.
Sqrt$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Sqrt$
SquaredError - Class in org.apache.spark.mllib.tree.loss: :: DeveloperApi :: Class for squared error loss calculation.
SquaredError() - Constructor for class org.apache.spark.mllib.tree.loss.SquaredError
SquaredEuclideanSilhouette - Class in org.apache.spark.ml.evaluation: SquaredEuclideanSilhouette computes the average of the Silhouette over all the data of the dataset, which is a measure of how appropriately the data have been clustered.
SquaredEuclideanSilhouette() - Constructor for class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette
SquaredEuclideanSilhouette.ClusterStats - Class in org.apache.spark.ml.evaluation
SquaredEuclideanSilhouette.ClusterStats$ - Class in org.apache.spark.ml.evaluation
SquaredL2Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Updater for L2 regularized problems.
SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
squaredNormSum() - Method in class org.apache.spark.ml.evaluation.SquaredEuclideanSilhouette.ClusterStats
Src - Static variable in class org.apache.spark.graphx.TripletFields: Expose the source and edge fields but not the destination field.
srcAttr() - Method in class org.apache.spark.graphx.EdgeContext: The vertex attribute of the edge's source vertex.
srcAttr() - Method in class org.apache.spark.graphx.EdgeTriplet: The source vertex attribute.
srcAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
srcCol() - Method in interface org.apache.spark.ml.clustering.PowerIterationClusteringParams: Param for the name of the input column for source vertex IDs.
srcId() - Method in class org.apache.spark.graphx.Edge
srcId() - Method in class org.apache.spark.graphx.EdgeContext: The vertex id of the edge's source vertex.
srcId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
stackTrace() - Method in class org.apache.spark.ExceptionFailure
StackTrace - Class in org.apache.spark.status.api.v1
StackTrace(Seq<String>) - Constructor for class org.apache.spark.status.api.v1.StackTrace
stackTrace() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
stackTraceFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
stackTraceToJson(StackTraceElement[]) - Static method in class org.apache.spark.util.JsonProtocol
stage() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
STAGE() - Static method in class org.apache.spark.status.TaskIndexNames
STAGE_DAG() - Static method in class org.apache.spark.ui.ToolTips
STAGE_TIMELINE() - Static method in class org.apache.spark.ui.ToolTips
stageAttempt() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
stageAttemptId() - Method in class org.apache.spark.ContextBarrierId
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
stageAttemptNumber() - Method in class org.apache.spark.BarrierTaskContext
stageAttemptNumber() - Method in class org.apache.spark.TaskContext: How many times the stage that this task belongs to has been attempted.
stageCompletedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
stageCompletedToJson(SparkListenerStageCompleted) - Static method in class org.apache.spark.util.JsonProtocol
StageData - Class in org.apache.spark.status.api.v1
stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
stageId() - Method in class org.apache.spark.BarrierTaskContext
stageId() - Method in class org.apache.spark.ContextBarrierId
stageId() - Method in interface org.apache.spark.scheduler.Schedulable
stageId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
stageId() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
stageId() - Method in class org.apache.spark.scheduler.SparkListenerSpeculativeTaskSubmitted
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
stageId() - Method in class org.apache.spark.scheduler.StageInfo
stageId() - Method in interface org.apache.spark.SparkStageInfo
stageId() - Method in class org.apache.spark.SparkStageInfoImpl
stageId() - Method in class org.apache.spark.status.api.v1.StageData
stageId() - Method in class org.apache.spark.TaskContext: The ID of the stage that this task belongs to.
stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
stageIds() - Method in interface org.apache.spark.SparkJobInfo
stageIds() - Method in class org.apache.spark.SparkJobInfoImpl
stageIds() - Method in class org.apache.spark.status.api.v1.JobData
stageIds() - Method in class org.apache.spark.status.LiveJob
stageIds() - Method in class org.apache.spark.status.SchedulerPool
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
StageInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Stores information about a stage to pass from the scheduler to SparkListeners.
StageInfo(int, int, String, int, Seq<RDDInfo>, Seq<Object>, String, TaskMetrics, Seq<Seq<TaskLocation>>) - Constructor for class org.apache.spark.scheduler.StageInfo
stageInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol: JSON deserialization methods for classes SparkListenerEvents depend on.
stageInfos() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
stageInfoToJson(StageInfo) - Static method in class org.apache.spark.util.JsonProtocol: JSON serialization methods for classes SparkListenerEvents depend on.
stageName() - Method in class org.apache.spark.ml.clustering.InternalKMeansModelWriter
stageName() - Method in class org.apache.spark.ml.clustering.PMMLKMeansModelWriter
stageName() - Method in class org.apache.spark.ml.regression.InternalLinearRegressionModelWriter
stageName() - Method in class org.apache.spark.ml.regression.PMMLLinearRegressionModelWriter
stageName() - Method in interface org.apache.spark.ml.util.MLFormatRegister: The string that represents the stage type that this writer supports.
stages() - Method in class org.apache.spark.ml.Pipeline: Param for pipeline stages.
stages() - Method in class org.apache.spark.ml.PipelineModel
StageStatus - Enum in org.apache.spark.status.api.v1
stageSubmittedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
stageSubmittedToJson(SparkListenerStageSubmitted) - Static method in class org.apache.spark.util.JsonProtocol
standardization() - Method in interface org.apache.spark.ml.param.shared.HasStandardization: Param for whether to standardize the training features before fitting the model.
StandardNormalGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the standard normal distribution.
StandardNormalGenerator() - Constructor for class org.apache.spark.mllib.random.StandardNormalGenerator
StandardScaler - Class in org.apache.spark.ml.feature: Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.
StandardScaler(String) - Constructor for class org.apache.spark.ml.feature.StandardScaler
StandardScaler() - Constructor for class org.apache.spark.ml.feature.StandardScaler
StandardScaler - Class in org.apache.spark.mllib.feature: Standardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set.
StandardScaler(boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScaler
StandardScaler() - Constructor for class org.apache.spark.mllib.feature.StandardScaler
StandardScalerModel - Class in org.apache.spark.ml.feature: Model fitted by StandardScaler.
StandardScalerModel - Class in org.apache.spark.mllib.feature: Represents a StandardScaler model that can transform vectors.
StandardScalerModel(Vector, Vector, boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
StandardScalerModel(Vector, Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
StandardScalerModel(Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
StandardScalerParams - Interface in org.apache.spark.ml.feature: Params for StandardScaler and StandardScalerModel.
starGraph(SparkContext, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators: Create a star graph with vertex 0 being the center.
start() - Method in interface org.apache.spark.metrics.sink.Sink
start() - Method in interface org.apache.spark.scheduler.SchedulerBackend
start() - Method in interface org.apache.spark.scheduler.TaskScheduler
start(String) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Starts the execution of the streaming query, which will continually output results to the given path as new data arrives.
start() - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Starts the execution of the streaming query, which will continually output results as new data arrives.
start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Start the execution of the streams.
start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
start() - Method in class org.apache.spark.streaming.dstream.InputDStream: Method called to start receiving data.
start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
start() - Method in class org.apache.spark.streaming.StreamingContext: Start the execution of the streams.
startApplication(SparkAppHandle.Listener...) - Method in class org.apache.spark.launcher.AbstractLauncher: Starts a Spark application.
startApplication(SparkAppHandle.Listener...) - Method in class org.apache.spark.launcher.InProcessLauncher: Starts a Spark application.
startApplication(SparkAppHandle.Listener...) - Method in class org.apache.spark.launcher.SparkLauncher: Starts a Spark application.
startIndexInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return the index of the first node in the given level.
startJettyServer(String, int, org.apache.spark.SSLOptions, Seq<ServletContextHandler>, SparkConf, String) - Static method in class org.apache.spark.ui.JettyUtils: Attempt to start a Jetty server bound to the supplied hostName:port using the given context handlers.
startOffset() - Method in class org.apache.spark.sql.streaming.SourceProgress
startOffset() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
startPosition() - Method in exception org.apache.spark.sql.AnalysisException
startServiceOnPort(int, Function1<Object, Tuple2<T, Object>>, SparkConf, String) - Static method in class org.apache.spark.util.Utils: Attempt to start a service on the given port, or fail after a number of attempts.
startsWith(Column) - Method in class org.apache.spark.sql.Column: String starts with.
startsWith(String) - Method in class org.apache.spark.sql.Column: String starts with another string literal.
startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
startTime() - Method in class org.apache.spark.SparkContext
startTime() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
startTime() - Method in class org.apache.spark.status.api.v1.streaming.OutputOperationInfo
startTime() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
startTime() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
stat() - Method in class org.apache.spark.sql.Dataset: Returns a DataFrameStatFunctions, which provides support for working with statistic functions.
StatCounter - Class in org.apache.spark.util: A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way.
StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
StatCounter() - Constructor for class org.apache.spark.util.StatCounter: Initialize the StatCounter with no values.
state() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
state() - Method in class org.apache.spark.scheduler.local.StatusUpdate
State<S> - Class in org.apache.spark.streaming: :: Experimental :: Abstract class for getting and updating the state in mapping function used in the mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).
State() - Constructor for class org.apache.spark.streaming.State
stateChanged(SparkAppHandle) - Method in interface org.apache.spark.launcher.SparkAppHandle.Listener: Callback for changes in the handle's state.
statement() - Method in class org.apache.spark.ml.feature.SQLTransformer: SQL statement parameter.
StateOperatorProgress - Class in org.apache.spark.sql.streaming: Information about updates made to stateful operators in a StreamingQuery during a trigger.
stateOperators() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
stateSnapshots() - Method in class org.apache.spark.streaming.api.java.JavaMapWithStateDStream
stateSnapshots() - Method in class org.apache.spark.streaming.dstream.MapWithStateDStream: Return a pair DStream where each RDD is the snapshot of the state of all the keys.
StateSpec<KeyType,ValueType,StateType,MappedType> - Class in org.apache.spark.streaming: :: Experimental :: Abstract class representing all the specifications of the DStream transformation mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).
StateSpec() - Constructor for class org.apache.spark.streaming.StateSpec
staticPageRank(int, double) - Method in class org.apache.spark.graphx.GraphOps: Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
staticParallelPersonalizedPageRank(long[], int, double) - Method in class org.apache.spark.graphx.GraphOps: Run parallel personalized PageRank for a given array of source vertices, such that all random walks are started relative to the source vertices.
staticPersonalizedPageRank(long, int, double) - Method in class org.apache.spark.graphx.GraphOps: Run Personalized PageRank for a fixed number of iterations, with all iterations originating at the source node, returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
StaticSources - Class in org.apache.spark.metrics.source
StaticSources() - Constructor for class org.apache.spark.metrics.source.StaticSources
statistic() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
statistic() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
statistic() - Method in interface org.apache.spark.mllib.stat.test.TestResult: Test statistic.
Statistics - Class in org.apache.spark.mllib.stat: API for statistical functions in MLlib.
Statistics() - Constructor for class org.apache.spark.mllib.stat.Statistics
Statistics - Interface in org.apache.spark.sql.sources.v2.reader: An interface to represent statistics for a data source, which is returned by SupportsReportStatistics.estimateStatistics().
stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
stats() - Method in class org.apache.spark.mllib.tree.model.Node
stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
StatsdMetricType - Class in org.apache.spark.metrics.sink
StatsdMetricType() - Constructor for class org.apache.spark.metrics.sink.StatsdMetricType
StatsReportListener - Class in org.apache.spark.scheduler: :: DeveloperApi :: Simple SparkListener that logs a few summary statistics when each stage completes.
StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
StatsReportListener - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: A simple StreamingListener that logs summary statistics across Spark Streaming batches (param numBatchInfos: number of last batches to consider for generating statistics; default: 10).
StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
status() - Method in class org.apache.spark.scheduler.TaskInfo
status() - Method in interface org.apache.spark.SparkJobInfo
status() - Method in class org.apache.spark.SparkJobInfoImpl
status() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Returns the current status of the query.
status() - Method in class org.apache.spark.status.api.v1.JobData
status() - Method in class org.apache.spark.status.api.v1.StageData
status() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
status() - Method in class org.apache.spark.status.api.v1.TaskData
status() - Method in class org.apache.spark.status.LiveJob
status() - Method in class org.apache.spark.status.LiveStage
STATUS() - Static method in class org.apache.spark.status.TaskIndexNames
status() - Method in class org.apache.spark.storage.BlockManagerMessages.BlockLocationsAndStatus
statusTracker() - Method in class org.apache.spark.api.java.JavaSparkContext
statusTracker() - Method in class org.apache.spark.SparkContext
StatusUpdate(String, long, Enumeration.Value, org.apache.spark.util.SerializableBuffer) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
StatusUpdate - Class in org.apache.spark.scheduler.local
StatusUpdate(long, Enumeration.Value, ByteBuffer) - Constructor for class org.apache.spark.scheduler.local.StatusUpdate
StatusUpdate$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
STD() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
std() - Method in class org.apache.spark.ml.attribute.NumericAttribute
std() - Method in class org.apache.spark.ml.feature.StandardScalerModel
std() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
std() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
stddev(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: alias for stddev_samp.
stddev(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: alias for stddev_samp.
stddev_pop(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the population standard deviation of the expression in a group.
stddev_pop(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the population standard deviation of the expression in a group.
stddev_samp(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sample standard deviation of the expression in a group.
stddev_samp(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sample standard deviation of the expression in a group.
stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the population standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the population standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.util.StatCounter: Return the population standard deviation of the values.
stepSize() - Method in interface org.apache.spark.ml.param.shared.HasStepSize: Param for Step size to be used for each iteration of optimization (> 0).
stepSize() - Method in interface org.apache.spark.ml.tree.GBTParams: Param for Step size (a.k.a. learning rate) in interval (0, 1] for shrinking the contribution of each estimator.
stop() - Method in class org.apache.spark.api.java.JavaSparkContext: Shut down the SparkContext.
stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
stop() - Method in interface org.apache.spark.launcher.SparkAppHandle: Asks the application to stop.
stop() - Method in interface org.apache.spark.metrics.sink.Sink
stop() - Method in interface org.apache.spark.rpc.RpcEndpoint: A convenient method to stop RpcEndpoint.
stop() - Method in interface org.apache.spark.scheduler.SchedulerBackend
stop() - Method in interface org.apache.spark.scheduler.TaskScheduler
stop() - Method in class org.apache.spark.SparkContext: Shut down the SparkContext.
stop() - Method in class org.apache.spark.sql.SparkSession: Stop the underlying SparkContext.
stop() - Method in interface org.apache.spark.sql.streaming.StreamingQuery: Stops the execution of this query if it is running.
stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
stop() - Method in class org.apache.spark.streaming.dstream.InputDStream: Method called to stop receiving data.
stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver: Stop the receiver completely.
stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Stop the receiver completely due to an exception.
stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext: Stop the execution of the streams immediately (does not wait for all received data to be processed).
stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext: Stop the execution of the streams, with option of ensuring all received data has been processed.
StopAllReceivers - Class in org.apache.spark.streaming.scheduler: This message will trigger ReceiverTrackerEndpoint to send stop signals to all registered receivers.
StopAllReceivers() - Constructor for class org.apache.spark.streaming.scheduler.StopAllReceivers
StopBlockManagerMaster$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.StopBlockManagerMaster$
StopCoordinator - Class in org.apache.spark.scheduler
StopCoordinator() - Constructor for class org.apache.spark.scheduler.StopCoordinator
StopDriver$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopDriver$
StopExecutor - Class in org.apache.spark.scheduler.local
StopExecutor() - Constructor for class org.apache.spark.scheduler.local.StopExecutor
StopExecutor$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutor$
StopExecutors$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StopExecutors$
StopMapOutputTracker - Class in org.apache.spark
StopMapOutputTracker() - Constructor for class org.apache.spark.StopMapOutputTracker
StopReceiver - Class in org.apache.spark.streaming.receiver
StopReceiver() - Constructor for class org.apache.spark.streaming.receiver.StopReceiver
stopWords() - Method in class org.apache.spark.ml.feature.StopWordsRemover: The words to be filtered out.
StopWordsRemover - Class in org.apache.spark.ml.feature: A feature transformer that filters out stop words from input.
StopWordsRemover(String) - Constructor for class org.apache.spark.ml.feature.StopWordsRemover
StopWordsRemover() - Constructor for class org.apache.spark.ml.feature.StopWordsRemover
storage() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
STORAGE_MEMORY() - Static method in class org.apache.spark.ui.ToolTips
storageLevel() - Method in class org.apache.spark.sql.Dataset: Get the Dataset's current storage level, or StorageLevel.NONE if not persisted.
storageLevel() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
storageLevel() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
storageLevel() - Method in class org.apache.spark.status.LiveRDD
storageLevel() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
storageLevel() - Method in class org.apache.spark.storage.BlockStatus
storageLevel() - Method in class org.apache.spark.storage.BlockUpdatedInfo
storageLevel() - Method in class org.apache.spark.storage.RDDInfo
StorageLevel - Class in org.apache.spark.storage: :: DeveloperApi :: Flags for controlling the storage of an RDD.
StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
storageLevelFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
StorageLevels - Class in org.apache.spark.api.java: Expose some commonly useful storage level constants.
StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
storageLevelToJson(StorageLevel) - Static method in class org.apache.spark.util.JsonProtocol
StorageUtils - Class in org.apache.spark.storage: Helper methods for storage-related objects.
StorageUtils() - Constructor for class org.apache.spark.storage.StorageUtils
store(T) - Method in class org.apache.spark.streaming.receiver.Receiver: Store a single item of received data to Spark's memory.
store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an ArrayBuffer of received data as a data block into Spark's memory.
store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an ArrayBuffer of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver: Store the bytes of received data as a data block into Spark's memory.
store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store the bytes of received data as a data block into Spark's memory.
storeBlock(StreamBlockId, ReceivedBlock) - Method in interface org.apache.spark.streaming.receiver.ReceivedBlockHandler: Store a received block with the given block id and return related metadata
storeValue(T) - Method in class org.apache.spark.storage.memory.DeserializedValuesHolder
storeValue(T) - Method in class org.apache.spark.storage.memory.SerializedValuesHolder
storeValue(T) - Method in interface org.apache.spark.storage.memory.ValuesHolder
strategy() - Method in interface org.apache.spark.ml.feature.ImputerParams: The imputation strategy.
Strategy - Class in org.apache.spark.mllib.tree.configuration: Stores all the configuration options for tree construction. Parameter algo: the learning goal.
Strategy(Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>, int, double, int, double, boolean, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
Strategy(Enumeration.Value, Impurity, int, int, int, Map<Integer, Integer>) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy: Java-friendly constructor for Strategy
StratifiedSamplingUtils - Class in org.apache.spark.util.random: Auxiliary functions and data structures for the sampleByKey method in PairRDDFunctions.
StratifiedSamplingUtils() - Constructor for class org.apache.spark.util.random.StratifiedSamplingUtils
STREAM() - Static method in class org.apache.spark.storage.BlockId
StreamBlockId - Class in org.apache.spark.storage
StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
streamId() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
streamId() - Method in class org.apache.spark.storage.StreamBlockId
streamId() - Method in class org.apache.spark.streaming.receiver.Receiver: Get the unique identifier of the receiver input stream that this receiver is associated with.
streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
streamIdToInputInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
StreamingContext - Class in org.apache.spark.streaming: Main entry point for Spark Streaming functionality.
StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext using an existing SparkContext.
StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext by providing the configuration necessary for a new SparkContext.
StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext by providing the details necessary for creating a new SparkContext.
StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext: Recreate a StreamingContext from a checkpoint file.
StreamingContext(String) - Constructor for class org.apache.spark.streaming.StreamingContext: Recreate a StreamingContext from a checkpoint file.
StreamingContext(String, SparkContext) - Constructor for class org.apache.spark.streaming.StreamingContext: Recreate a StreamingContext from a checkpoint file using an existing SparkContext.
StreamingContextPythonHelper - Class in org.apache.spark.streaming
StreamingContextPythonHelper() - Constructor for class org.apache.spark.streaming.StreamingContextPythonHelper
StreamingContextState - Enum in org.apache.spark.streaming: :: DeveloperApi :: Represents the state of a StreamingContext.
StreamingKMeans - Class in org.apache.spark.mllib.clustering: StreamingKMeans provides methods for configuring a streaming k-means analysis, training the model on streaming data, and using the model to make predictions on streaming data.
StreamingKMeans(int, double, String) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
StreamingKMeans() - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
StreamingKMeansModel - Class in org.apache.spark.mllib.clustering: StreamingKMeansModel extends MLlib's KMeansModel for streaming algorithms, so it can keep track of a continuously updated weight associated with each cluster, and also update the model by doing a single iteration of the standard k-means algorithm.
StreamingKMeansModel(Vector[], double[]) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeansModel
StreamingLinearAlgorithm<M extends GeneralizedLinearModel,A extends GeneralizedLinearAlgorithm<M>> - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: StreamingLinearAlgorithm implements methods for continuously training a generalized linear model on streaming data, and using it for prediction on (possibly different) streaming data.
StreamingLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
StreamingLinearRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train or predict a linear regression model on streaming data.
StreamingLinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Construct a StreamingLinearRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}.
StreamingListener - Interface in org.apache.spark.streaming.scheduler: :: DeveloperApi :: A listener interface for receiving information about an ongoing streaming computation.
StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Base trait for events related to StreamingListener
StreamingListenerOutputOperationCompleted - Class in org.apache.spark.streaming.scheduler
StreamingListenerOutputOperationCompleted(OutputOperationInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
StreamingListenerOutputOperationStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerOutputOperationStarted(OutputOperationInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
StreamingListenerStreamingStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerStreamingStarted(long) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerStreamingStarted
StreamingLogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification: Train or predict a logistic regression model on streaming data.
StreamingLogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Construct a StreamingLogisticRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}.
StreamingQuery - Interface in org.apache.spark.sql.streaming: A handle to a query that is executing continuously in the background as new data arrives.
StreamingQueryException - Exception in org.apache.spark.sql.streaming: Exception that stopped a StreamingQuery.
StreamingQueryListener - Class in org.apache.spark.sql.streaming: Interface for listening to events related to StreamingQueries.
StreamingQueryListener() - Constructor for class org.apache.spark.sql.streaming.StreamingQueryListener
StreamingQueryListener.Event - Interface in org.apache.spark.sql.streaming: Base type of StreamingQueryListener events
StreamingQueryListener.QueryProgressEvent - Class in org.apache.spark.sql.streaming: Event representing any progress updates in a query.
StreamingQueryListener.QueryStartedEvent - Class in org.apache.spark.sql.streaming: Event representing the start of a query. Parameter id: a unique query id that persists across restarts.
StreamingQueryListener.QueryTerminatedEvent - Class in org.apache.spark.sql.streaming: Event representing the termination of a query.
StreamingQueryManager - Class in org.apache.spark.sql.streaming: A class to manage all the StreamingQuery instances active in a SparkSession.
StreamingQueryProgress - Class in org.apache.spark.sql.streaming: Information about progress made in the execution of a StreamingQuery during a trigger.
StreamingQueryStatus - Class in org.apache.spark.sql.streaming: Reports information about the instantaneous status of a streaming query.
StreamingStatistics - Class in org.apache.spark.status.api.v1.streaming
StreamingTest - Class in org.apache.spark.mllib.stat.test: Performs online 2-sample significance testing for a stream of (Boolean, Double) pairs.
StreamingTest() - Constructor for class org.apache.spark.mllib.stat.test.StreamingTest
StreamingTestMethod - Interface in org.apache.spark.mllib.stat.test: Significance testing methods for StreamingTest.
StreamInputInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Track the information of input stream at specified batch time.
StreamInputInfo(int, long, Map<String, Object>) - Constructor for class org.apache.spark.streaming.scheduler.StreamInputInfo
streamName() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
streams() - Method in class org.apache.spark.sql.SparkSession: :: Experimental :: Returns a StreamingQueryManager that allows managing all the StreamingQuery instances active on this.
streams() - Method in class org.apache.spark.sql.SQLContext
StreamSinkProvider - Interface in org.apache.spark.sql.sources: :: Experimental :: Implemented by objects that can produce a streaming Sink for a specific format or system.
StreamSourceProvider - Interface in org.apache.spark.sql.sources: :: Experimental :: Implemented by objects that can produce a streaming Source for a specific format or system.
StreamWriter - Interface in org.apache.spark.sql.sources.v2.writer.streaming: A DataSourceWriter for use with structured streaming.
StreamWriteSupport - Interface in org.apache.spark.sql.sources.v2: A mix-in interface for DataSourceV2.
STRING() - Static method in class org.apache.spark.api.r.SerializationFormats
string() - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type string.
STRING() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable string type.
StringAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.StringAccumulatorParam$: Deprecated.
StringArrayParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Array[String]] for Java.
StringArrayParam(Params, String, String, Function1<String[], Object>) - Constructor for class org.apache.spark.ml.param.StringArrayParam
StringArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.StringArrayParam
StringContains - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a string that contains the string value.
StringContains(String, String) - Constructor for class org.apache.spark.sql.sources.StringContains
StringEndsWith - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a string that ends with value.
StringEndsWith(String, String) - Constructor for class org.apache.spark.sql.sources.StringEndsWith
stringHalfWidth(String) - Static method in class org.apache.spark.util.Utils: Return the number of half widths in a given string.
StringIndexer - Class in org.apache.spark.ml.feature: A label indexer that maps a string column of labels to an ML column of label indices.
StringIndexer(String) - Constructor for class org.apache.spark.ml.feature.StringIndexer
StringIndexer() - Constructor for class org.apache.spark.ml.feature.StringIndexer
StringIndexerBase - Interface in org.apache.spark.ml.feature: Base trait for StringIndexer and StringIndexerModel.
StringIndexerModel - Class in org.apache.spark.ml.feature: Model fitted by StringIndexer.
StringIndexerModel(String, String[]) - Constructor for class org.apache.spark.ml.feature.StringIndexerModel
StringIndexerModel(String[]) - Constructor for class org.apache.spark.ml.feature.StringIndexerModel
stringIndexerOrderType() - Method in interface org.apache.spark.ml.feature.RFormulaBase: Param for how to order categories of a string FEATURE column used by StringIndexer.
stringOrderType() - Method in interface org.apache.spark.ml.feature.StringIndexerBase: Param for how to order labels of string column.
StringRRDD<T> - Class in org.apache.spark.api.r: An RDD that stores R objects as Array[String].
StringRRDD(RDD<T>, byte[], String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.StringRRDD
StringStartsWith - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a string that starts with value.
StringStartsWith(String, String) - Constructor for class org.apache.spark.sql.sources.StringStartsWith
StringToColumn(StringContext) - Constructor for class org.apache.spark.sql.SQLImplicits.StringToColumn
stringToSeq(String, Function1<String, T>) - Static method in class org.apache.spark.internal.config.ConfigHelpers
stringToSeq(String) - Static method in class org.apache.spark.util.Utils
StringType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the StringType object.
StringType - Class in org.apache.spark.sql.types: The data type representing String values.
StringType() - Constructor for class org.apache.spark.sql.types.StringType
stripXSS(String) - Static method in class org.apache.spark.ui.UIUtils: Remove suspicious characters from user input to prevent cross-site scripting (XSS) attacks.
stronglyConnectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps: Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.
StronglyConnectedComponents - Class in org.apache.spark.graphx.lib: Strongly connected components algorithm implementation.
StronglyConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.StronglyConnectedComponents
struct(Seq<StructField>) - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type struct.
struct(StructType) - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type struct.
struct(Column...) - Static method in class org.apache.spark.sql.functions: Creates a new struct column.
struct(String, String...) - Static method in class org.apache.spark.sql.functions: Creates a new struct column that composes multiple input columns.
struct(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Creates a new struct column.
struct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Creates a new struct column that composes multiple input columns.
StructField - Class in org.apache.spark.sql.types: A field inside a StructType.
StructField(String, DataType, boolean, Metadata) - Constructor for class org.apache.spark.sql.types.StructField
StructType - Class in org.apache.spark.sql.types: A StructType object can be constructed from an array of StructField objects.
StructType(StructField[]) - Constructor for class org.apache.spark.sql.types.StructType
StructType() - Constructor for class org.apache.spark.sql.types.StructType: No-arg constructor for kryo.
stsCredentials(String, String) - Method in class org.apache.spark.streaming.kinesis.SparkAWSCredentials.Builder: Use STS to assume an IAM role for temporary session-based authentication.
stsCredentials(String, String, String) - Method in class org.apache.spark.streaming.kinesis.SparkAWSCredentials.Builder: Use STS to assume an IAM role for temporary session-based authentication.
StudentTTest - Class in org.apache.spark.mllib.stat.test: Performs Student's 2-sample t-test.
StudentTTest() - Constructor for class org.apache.spark.mllib.stat.test.StudentTTest
subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.Graph: Restricts the graph to only the vertices and edges satisfying the predicates.
subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.GraphImpl
submissionTime() - Method in class org.apache.spark.scheduler.StageInfo: When this stage was submitted from the DAGScheduler to a TaskScheduler.
submissionTime() - Method in interface org.apache.spark.SparkStageInfo
submissionTime() - Method in class org.apache.spark.SparkStageInfoImpl
submissionTime() - Method in class org.apache.spark.status.api.v1.JobData
submissionTime() - Method in class org.apache.spark.status.api.v1.StageData
submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in interface org.apache.spark.JobSubmitter: Submit a job for execution and return a FutureAction holding the result.
submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext: Submit a job for execution and return a FutureAction holding the result.
submitTasks(TaskSet) - Method in interface org.apache.spark.scheduler.TaskScheduler
subModels() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
subModels() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
subsamplingRate() - Method in interface org.apache.spark.ml.clustering.LDAParams: For Online optimizer only: optimizer = "online".
subsamplingRate() - Method in interface org.apache.spark.ml.tree.TreeEnsembleParams: Fraction of the training data used for learning each decision tree, in range (0, 1].
subsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
subsetAccuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns subset accuracy (for equal sets of labels)
substituteAppId(String, String) - Static method in class org.apache.spark.util.Utils: Replaces all the {{APP_ID}} occurrences with the App Id.
substituteAppNExecIds(String, String, String) - Static method in class org.apache.spark.util.Utils: Replaces all the {{EXECUTOR_ID}} occurrences with the Executor Id and {{APP_ID}} occurrences with the App Id.
substr(Column, Column) - Method in class org.apache.spark.sql.Column: An expression that returns a substring.
substr(int, int) - Method in class org.apache.spark.sql.Column: An expression that returns a substring.
substring(Column, int, int) - Static method in class org.apache.spark.sql.functions: Returns the substring that starts at pos and is of length len when str is String type, or the slice of the byte array that starts at pos (in bytes) and is of length len when str is Binary type.
substring_index(Column, String, int) - Static method in class org.apache.spark.sql.functions: Returns the substring from string str before count occurrences of the delimiter delim.
subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Subtracts the given block matrix other from this block matrix: this - other.
subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from this whose keys are not in other.
subtractMetrics(TaskMetrics, TaskMetrics) - Static method in class org.apache.spark.status.LiveEntityHelpers: Subtract m2 values from m1.
succeededTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
succeededTasks() - Method in class org.apache.spark.status.LiveExecutorStageSummary
success(T) - Static method in class org.apache.spark.ml.feature.RFormulaParser
Success - Class in org.apache.spark: :: DeveloperApi :: Task succeeded.
Success() - Constructor for class org.apache.spark.Success
successful() - Method in class org.apache.spark.scheduler.TaskInfo
sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Add up the elements in this RDD.
Sum() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Add up the elements in this RDD.
sum(MapFunction<T, Double>) - Static method in class org.apache.spark.sql.expressions.javalang.typed: Sum aggregate function for floating point (double) type.
sum(Function1<IN, Object>) - Static method in class org.apache.spark.sql.expressions.scalalang.typed: Sum aggregate function for floating point (double) type.
sum(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sum of all values in the expression.
sum(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sum of all values in the given column.
sum(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Compute the sum for each numeric column for each group.
sum(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset: Compute the sum for each numeric column for each group.
sum() - Method in class org.apache.spark.util.DoubleAccumulator: Returns the sum of elements added to the accumulator.
sum() - Method in class org.apache.spark.util.LongAccumulator: Returns the sum of elements added to the accumulator.
sum() - Method in class org.apache.spark.util.StatCounter
sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Approximate operation to return the sum within a timeout.
sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Approximate operation to return the sum within a timeout.
sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Approximate operation to return the sum within a timeout.
sumDistinct(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sum of distinct values in the expression.
sumDistinct(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sum of distinct values in the expression.
sumLong(MapFunction<T, Long>) - Static method in class org.apache.spark.sql.expressions.javalang.typed: Sum aggregate function for integral (long, i.e. 64-bit integer) type.
sumLong(Function1<IN, Object>) - Static method in class org.apache.spark.sql.expressions.scalalang.typed: Sum aggregate function for integral (long, i.e. 64-bit integer) type.
Summarizer - Class in org.apache.spark.ml.stat: Tools for vectorized statistics on MLlib Vectors.
Summarizer() - Constructor for class org.apache.spark.ml.stat.Summarizer
summary() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: Gets summary of model on training set.
summary() - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel: Gets summary of model on training set.
summary() - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel: Gets summary of model on training set.
summary() - Method in class org.apache.spark.ml.clustering.KMeansModel: Gets summary of model on training set.
summary() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel: Gets R-like summary of model on training set.
summary() - Method in class org.apache.spark.ml.regression.LinearRegressionModel: Gets summary of model on training set.
summary(Column, Column) - Method in class org.apache.spark.ml.stat.SummaryBuilder: Returns an aggregate object that contains the summary of the column with the requested metrics.
summary(Column) - Method in class org.apache.spark.ml.stat.SummaryBuilder
summary(String...) - Method in class org.apache.spark.sql.Dataset: Computes specified statistics for numeric and string columns.
summary(Seq<String>) - Method in class org.apache.spark.sql.Dataset: Computes specified statistics for numeric and string columns.
SummaryBuilder - Class in org.apache.spark.ml.stat: A builder object that provides summary statistics about a given column.
SummaryBuilder() - Constructor for class org.apache.spark.ml.stat.SummaryBuilder
supportDataType(DataType, boolean) - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier: Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2
supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor: Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2
supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.mllib.tree.RandomForest: List of supported feature subset sampling strategies.
supportedImpurities() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier: Accessor for supported impurities: entropy, gini
supportedImpurities() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier: Accessor for supported impurity settings: entropy, gini
supportedImpurities() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor: Accessor for supported impurities: variance
supportedImpurities() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor: Accessor for supported impurity settings: variance
supportedLossTypes() - Static method in class org.apache.spark.ml.classification.GBTClassifier: Accessor for supported loss settings: logistic
supportedLossTypes() - Static method in class org.apache.spark.ml.regression.GBTRegressor: Accessor for supported loss settings: squared (L2), absolute (L1)
supportedOptimizers() - Method in interface org.apache.spark.ml.clustering.LDAParams: Supported values for Param optimizer.
supportedSelectorTypes() - Static method in class org.apache.spark.mllib.feature.ChiSqSelector: Set of selector types that ChiSqSelector supports.
SupportsPushDownFilters - Interface in org.apache.spark.sql.sources.v2.reader: A mix-in interface for DataSourceReader.
SupportsPushDownRequiredColumns - Interface in org.apache.spark.sql.sources.v2.reader: A mix-in interface for DataSourceReader.
SupportsReportPartitioning - Interface in org.apache.spark.sql.sources.v2.reader: A mix-in interface for DataSourceReader.
SupportsReportStatistics - Interface in org.apache.spark.sql.sources.v2.reader: A mix-in interface for DataSourceReader.
SupportsScanColumnarBatch - Interface in org.apache.spark.sql.sources.v2.reader: A mix-in interface for DataSourceReader.
surrogateDF() - Method in class org.apache.spark.ml.feature.ImputerModel
SVDPlusPlus - Class in org.apache.spark.graphx.lib: Implementation of SVD++ algorithm.
SVDPlusPlus() - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus
SVDPlusPlus.Conf - Class in org.apache.spark.graphx.lib: Configuration parameters for SVDPlusPlus.
SVMDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate sample data used for SVM.
SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
SVMModel - Class in org.apache.spark.mllib.classification: Model for Support Vector Machines (SVMs).
SVMModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.SVMModel
SVMWithSGD - Class in org.apache.spark.mllib.classification: Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD: Construct an SVM object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
symbolToColumn(Symbol) - Method in class org.apache.spark.sql.SQLImplicits: An implicit conversion that turns a Scala Symbol into a Column.
symlink(File, File) - Static method in class org.apache.spark.util.Utils: Creates a symlink.
symmetricEigs(Function1<DenseVector<Object>, DenseVector<Object>>, int, int, double, int) - Static method in class org.apache.spark.mllib.linalg.EigenValueDecomposition: Compute the leading k eigenvalues and eigenvectors on a symmetric square matrix using ARPACK.
syr(double, Vector, DenseMatrix) - Static method in class org.apache.spark.ml.linalg.BLAS: A := alpha * x * x^T + A
syr(double, Vector, DenseMatrix) - Static method in class org.apache.spark.mllib.linalg.BLAS: A := alpha * x * x^T + A
SYSTEM_DEFAULT() - Static method in class org.apache.spark.sql.types.DecimalType
systemProperties() - Method in class org.apache.spark.status.api.v1.ApplicationEnvironmentInfo
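
The syr entries above give the update rule A := alpha * x * x^T + A. As a rough illustration of that formula only, the sketch below applies it to plain Java arrays. The class name SyrSketch and the use of double[] and double[][] are assumptions made for this example; Spark's own syr methods take a Vector and a DenseMatrix, and this is not their implementation.

```java
// Illustrative sketch only: plain arrays stand in for Spark's Vector and DenseMatrix.
final class SyrSketch {

    // Applies the symmetric rank-1 update A := alpha * x * x^T + A in place.
    // Assumes a is a square matrix whose dimension equals x.length.
    static void syr(double alpha, double[] x, double[][] a) {
        int n = x.length;
        if (a.length != n) {
            throw new IllegalArgumentException("a must have " + n + " rows");
        }
        for (int i = 0; i < n; i++) {
            if (a[i].length != n) {
                throw new IllegalArgumentException("a must have " + n + " columns");
            }
            for (int j = 0; j < n; j++) {
                // Entry (i, j) of x * x^T is x[i] * x[j].
                a[i][j] += alpha * x[i] * x[j];
            }
        }
    }

    public static void main(String[] args) {
        double[][] a = {{1.0, 0.0}, {0.0, 1.0}};
        syr(2.0, new double[] {1.0, 3.0}, a);
        // prints [[3.0, 6.0], [6.0, 19.0]]
        System.out.println(java.util.Arrays.deepToString(a));
    }
}
```

Starting from the identity matrix with alpha = 2 and x = (1, 3), the update adds 2 * x * x^T = [[2, 6], [6, 18]], which gives the result shown in main.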

T

t() - Method in class org.apache.spark.SerializableWritable
Table - Class in org.apache.spark.sql.catalog: A table in Spark, as returned by the listTables method in Catalog.
Table(String, String, String, String, boolean) - Constructor for class org.apache.spark.sql.catalog.Table
table(String) - Method in class org.apache.spark.sql.DataFrameReader: Returns the specified table as a DataFrame.
table() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
table(String) - Method in class org.apache.spark.sql.SparkSession: Returns the specified table/view as a DataFrame.
table(String) - Method in class org.apache.spark.sql.SQLContext
table(int) - Method in interface org.apache.spark.ui.PagedTable
TABLE_CLASS_NOT_STRIPED() - Static method in class org.apache.spark.ui.UIUtils
TABLE_CLASS_STRIPED() - Static method in class org.apache.spark.ui.UIUtils
TABLE_CLASS_STRIPED_SORTABLE() - Static method in class org.apache.spark.ui.UIUtils
TABLE_KEY - Static variable in class org.apache.spark.sql.sources.v2.DataSourceOptions: The option key for table name.
tableCssClass() - Method in interface org.apache.spark.ui.PagedTable
tableDesc() - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
tableExists(String) - Method in class org.apache.spark.sql.catalog.Catalog: Check if the table or view with the specified name exists.
tableExists(String, String) - Method in class org.apache.spark.sql.catalog.Catalog: Check if the table or view with the specified name exists in the specified database.
tableExists(String, String) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Return whether a table/view with the specified name exists.
tableId() - Method in interface org.apache.spark.ui.PagedTable
tableName() - Method in class org.apache.spark.sql.sources.v2.DataSourceOptions: Returns the value of the table name option.
tableNames() - Method in class org.apache.spark.sql.SQLContext
tableNames(String) - Method in class org.apache.spark.sql.SQLContext
TableReader - Interface in org.apache.spark.sql.hive: A trait for subclasses that handle table scans.
tables() - Method in class org.apache.spark.sql.SQLContext
tables(String) - Method in class org.apache.spark.sql.SQLContext
TableScan - Interface in org.apache.spark.sql.sources: A BaseRelation that can produce all of its tuples as an RDD of Row objects.
tableType() - Method in class org.apache.spark.sql.catalog.Table
take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.rdd.RDD: Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.sql.Dataset: Returns the first n rows in the Dataset.
takeAsList(int) - Method in class org.apache.spark.sql.Dataset: Returns the first n rows in the Dataset as a list.
takeAsync(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: The asynchronous version of the take action, which returns a future for retrieving the first num elements of this RDD.
takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for retrieving the first num elements of the RDD.
takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the first k (smallest) elements from this RDD as defined by the specified Comparator[T] and maintains the order.
takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the first k (smallest) elements from this RDD using the natural ordering for T while maintaining the order.
takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the first k (smallest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD: Return a fixed-size sampled subset of this RDD in an array
tallSkinnyQR(boolean) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Compute QR decomposition for RowMatrix.
tan(Column) - Static method in class org.apache.spark.sql.functions
tan(String) - Static method in class org.apache.spark.sql.functions
tanh(Column) - Static method in class org.apache.spark.sql.functions
tanh(String) - Static method in class org.apache.spark.sql.functions
targetStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
targetStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
task() - Method in class org.apache.spark.CleanupTaskWeakReference
TASK_DESERIALIZATION_TIME() - Static method in class org.apache.spark.ui.jobs.TaskDetailsClassNames
TASK_DESERIALIZATION_TIME() - Static method in class org.apache.spark.ui.ToolTips
TASK_INDEX() - Static method in class org.apache.spark.status.TaskIndexNames
TASK_TIME() - Static method in class org.apache.spark.ui.ToolTips
taskAttemptId() - Method in class org.apache.spark.BarrierTaskContext
taskAttemptId() - Method in class org.apache.spark.TaskContext: An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID).
TaskCommitDenied - Class in org.apache.spark: :: DeveloperApi :: Task requested the driver to commit, but was denied.
TaskCommitDenied(int, int, int) - Constructor for class org.apache.spark.TaskCommitDenied
TaskCommitMessage(Object) - Constructor for class org.apache.spark.internal.io.FileCommitProtocol.TaskCommitMessage
TaskCompletionListener - Interface in org.apache.spark.util: :: DeveloperApi ::
TaskContext - Class in org.apache.spark: Contextual information about a task which can be read or mutated during execution.
TaskContext() - Constructor for class org.apache.spark.TaskContext
TaskData - Class in org.apache.spark.status.api.v1
TaskDetailsClassNames - Class in org.apache.spark.ui.jobs: Names of the CSS classes corresponding to each type of task detail.
TaskDetailsClassNames() - Constructor for class org.apache.spark.ui.jobs.TaskDetailsClassNames
taskEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
TaskEndReason - Interface in org.apache.spark: :: DeveloperApi :: Various possible reasons why a task ended.
taskEndReasonFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
taskEndReasonToJson(TaskEndReason) - Static method in class org.apache.spark.util.JsonProtocol
taskEndToJson(SparkListenerTaskEnd) - Static method in class org.apache.spark.util.JsonProtocol
TaskFailedReason - Interface in org.apache.spark: :: DeveloperApi :: Various possible reasons why a task failed.
TaskFailureListener - Interface in org.apache.spark.util: :: DeveloperApi ::
taskFailures() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
taskFailures() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
taskGettingResultFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
taskGettingResultToJson(SparkListenerTaskGettingResult) - Static method in class org.apache.spark.util.JsonProtocol
taskId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.KillTask
taskId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate
taskId() - Method in class org.apache.spark.scheduler.local.KillTask
taskId() - Method in class org.apache.spark.scheduler.local.StatusUpdate
taskId() - Method in class org.apache.spark.scheduler.TaskInfo
taskId() - Method in class org.apache.spark.status.api.v1.TaskData
taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
TaskIndexNames - Class in org.apache.spark.status: Tasks have a lot of indices that are used in a few different places.
TaskIndexNames() - Constructor for class org.apache.spark.status.TaskIndexNames
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
TaskInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Information about a running task attempt inside a TaskSet.
TaskInfo(long, int, int, long, String, String, Enumeration.Value, boolean) - Constructor for class org.apache.spark.scheduler.TaskInfo
taskInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
taskInfoToJson(TaskInfo) - Static method in class org.apache.spark.util.JsonProtocol
TaskKilled - Class in org.apache.spark: :: DeveloperApi :: Task was killed intentionally and needs to be rescheduled.
TaskKilled(String, Seq<AccumulableInfo>, Seq<AccumulatorV2<?, ?>>) - Constructor for class org.apache.spark.TaskKilled
TaskKilledException - Exception in org.apache.spark: :: DeveloperApi :: Exception thrown when a task is explicitly killed (i.e., task failure is expected).
TaskKilledException(String) - Constructor for exception org.apache.spark.TaskKilledException
TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
TaskLocality - Class in org.apache.spark.scheduler
TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
taskLocality() - Method in class org.apache.spark.status.api.v1.TaskData
TaskLocation - Interface in org.apache.spark.scheduler: A location where a task should run.
TaskMetricDistributions - Class in org.apache.spark.status.api.v1
taskMetrics() - Method in class org.apache.spark.BarrierTaskContext
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
taskMetrics() - Method in class org.apache.spark.scheduler.StageInfo
taskMetrics() - Method in class org.apache.spark.status.api.v1.TaskData
TaskMetrics - Class in org.apache.spark.status.api.v1
taskMetrics() - Method in class org.apache.spark.TaskContext
taskMetricsFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
taskMetricsToJson(TaskMetrics) - Static method in class org.apache.spark.util.JsonProtocol
TaskResult<T> - Interface in org.apache.spark.scheduler
TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
TaskResultBlockId - Class in org.apache.spark.storage
TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
TaskResultLost - Class in org.apache.spark: :: DeveloperApi :: The task finished successfully, but the result was lost from the executor's block manager before it was fetched.
TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
tasks() - Method in class org.apache.spark.status.api.v1.StageData
TaskScheduler - Interface in org.apache.spark.scheduler: Low-level task scheduler interface, currently implemented exclusively by TaskSchedulerImpl.
TaskSchedulerIsSet - Class in org.apache.spark: An event that SparkContext uses to notify HeartbeatReceiver that SparkContext.taskScheduler is created.
TaskSchedulerIsSet() - Constructor for class org.apache.spark.TaskSchedulerIsSet
TaskSorting - Enum in org.apache.spark.status.api.v1
taskStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
taskStartToJson(SparkListenerTaskStart) - Static method in class org.apache.spark.util.JsonProtocol
TaskState - Class in org.apache.spark
TaskState() - Constructor for class org.apache.spark.TaskState
taskSucceeded(int, Object) - Method in interface org.apache.spark.scheduler.JobListener
taskTime() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
taskTime() - Method in class org.apache.spark.status.LiveExecutorStageSummary
taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
TEMP_DIR_SHUTDOWN_PRIORITY() - Static method in class org.apache.spark.util.ShutdownHookManager: The shutdown priority of temp directory must be lower than the SparkContext shutdown priority.
TEMP_LOCAL() - Static method in class org.apache.spark.storage.BlockId
TEMP_SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
tempFileWith(File) - Static method in class org.apache.spark.util.Utils: Returns the path of a temporary file in the same directory as path.
TeradataDialect - Class in org.apache.spark.sql.jdbc
TeradataDialect() - Constructor for class org.apache.spark.sql.jdbc.TeradataDialect
Term - Interface in org.apache.spark.ml.feature: R formula terms.
terminateProcess(Process, long) - Static method in class org.apache.spark.util.Utils: Terminates a process waiting for at most the specified duration.
test(Dataset<Row>, String, String) - Static method in class org.apache.spark.ml.stat.ChiSquareTest: Conduct Pearson's independence test for every feature against the label.
test(Dataset<?>, String, String, double...) - Static method in class org.apache.spark.ml.stat.KolmogorovSmirnovTest: Convenience function to conduct a one-sample, two-sided Kolmogorov-Smirnov test for probability distribution equality.
test(Dataset<?>, String, Function1<Object, Object>) - Static method in class org.apache.spark.ml.stat.KolmogorovSmirnovTest
test(Dataset<?>, String, Function<Double, Double>) - Static method in class org.apache.spark.ml.stat.KolmogorovSmirnovTest
test(Dataset<?>, String, String, Seq<Object>) - Static method in class org.apache.spark.ml.stat.KolmogorovSmirnovTest
TEST() - Static method in class org.apache.spark.storage.BlockId
TEST_ACCUM() - Static method in class org.apache.spark.InternalAccumulator
testCommandAvailable(String) - Static method in class org.apache.spark.TestUtils: Test if a command is available.
testOneSample(RDD<Object>, String, double...) - Static method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest: A convenience function that allows running the KS test for 1 set of sample data against a named distribution
testOneSample(RDD<Object>, Function1<Object, Object>) - Static method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest
testOneSample(RDD<Object>, RealDistribution) - Static method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest
testOneSample(RDD<Object>, String, Seq<Object>) - Static method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTest
TestResult<DF> - Interface in org.apache.spark.mllib.stat.test: Trait for hypothesis test results.
TestUtils - Class in org.apache.spark: Utilities for tests.
TestUtils() - Constructor for class org.apache.spark.TestUtils
text(String...) - Method in class org.apache.spark.sql.DataFrameReader: Loads text files and returns a DataFrame whose schema starts with a string column named "value", followed by partitioned columns if there are any.
text(String) - Method in class org.apache.spark.sql.DataFrameReader: Loads text files and returns a DataFrame whose schema starts with a string column named "value", followed by partitioned columns if there are any.
text(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads text files and returns a DataFrame whose schema starts with a string column named "value", followed by partitioned columns if there are any.
text(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame in a text file at the specified path.
text(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Loads text files and returns a DataFrame whose schema starts with a string column named "value", followed by partitioned columns if there are any.
textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.SparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String...) - Method in class org.apache.spark.sql.DataFrameReader: Loads text files and returns a Dataset of String.
textFile(String) - Method in class org.apache.spark.sql.DataFrameReader: Loads text files and returns a Dataset of String.
textFile(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads text files and returns a Dataset of String.
textFile(String) - Method in class org.apache.spark.sql.streaming.DataStreamReader: Loads text file(s) and returns a Dataset of String.
textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
textResponderToServlet(Function1<HttpServletRequest, String>) - Static method in class org.apache.spark.ui.JettyUtils
theta() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
thisClassName() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$: Hard-code class name string in case it changes in the future
thisClassName() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$: Hard-code class name string in case it changes in the future
thisClassName() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
thisFormatVersion() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$
thisFormatVersion() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$
thisFormatVersion() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$
thisFormatVersion() - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$
thisFormatVersion() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
threadId() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
threadName() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
ThreadSafeRpcEndpoint - Interface in org.apache.spark.rpc: A trait that requires RpcEnv to send messages to it in a thread-safe manner.
ThreadStackTrace - Class in org.apache.spark.status.api.v1
ThreadStackTrace(long, String, Thread.State, StackTrace, Option<Object>, String, Seq<String>) - Constructor for class org.apache.spark.status.api.v1.ThreadStackTrace
threadState() - Method in class org.apache.spark.status.api.v1.ThreadStackTrace
ThreadUtils - Class in org.apache.spark.util
ThreadUtils() - Constructor for class org.apache.spark.util.ThreadUtils
threshold() - Method in interface org.apache.spark.ml.classification.LinearSVCParams: Param for threshold in binary classification prediction.
threshold() - Method in class org.apache.spark.ml.feature.Binarizer: Param for threshold used to binarize continuous features.
threshold() - Method in interface org.apache.spark.ml.param.shared.HasThreshold: Param for threshold in binary classification prediction, in range [0, 1].
threshold() - Method in class org.apache.spark.ml.tree.ContinuousSplit
threshold() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
threshold() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
threshold() - Method in class org.apache.spark.mllib.tree.model.Split
thresholds() - Method in interface org.apache.spark.ml.param.shared.HasThresholds: Param for Thresholds in multi-class classification to adjust the probability of predicting each class.
thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns thresholds in descending order.
throwBalls(int, RDD<?>, double, DefaultPartitionCoalescer.PartitionLocations) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorUnblacklisted
time() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
time() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
time() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
time() - Method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
time() - Method in class org.apache.spark.scheduler.SparkListenerNodeUnblacklisted
time(Function0<T>) - Method in class org.apache.spark.sql.SparkSession: Executes some code block and prints to stdout the time taken to execute the block.
time() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException: Time when the exception occurred
time() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerStreamingStarted
Time - Class in org.apache.spark.streaming: This is a simple class that represents an absolute instant of time.
Time(long) - Constructor for class org.apache.spark.streaming.Time
timeFromString(String, TimeUnit) - Static method in class org.apache.spark.internal.config.ConfigHelpers
timeIt(int, Function0<BoxedUnit>, Option<Function0<BoxedUnit>>) - Static method in class org.apache.spark.util.Utils: Timing method based on iterations that permit JVM JIT optimization.
timeout(Duration) - Method in class org.apache.spark.streaming.StateSpec: Set the duration after which the state of an idle key will be removed.
TIMER() - Static method in class org.apache.spark.metrics.sink.StatsdMetricType
times(Decimal, Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
times(int) - Method in class org.apache.spark.streaming.Duration
times(int, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils: Method executed for repeating a task for side effects.
timestamp() - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type timestamp.
TIMESTAMP() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable timestamp type.
timestamp() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
TimestampType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the TimestampType object.
TimestampType - Class in org.apache.spark.sql.types: The data type representing java.sql.Timestamp values.
TimestampType() - Constructor for class org.apache.spark.sql.types.TimestampType
timeStringAsMs(String) - Static method in class org.apache.spark.util.Utils: Convert a time parameter such as (50s, 100ms, or 250us) to milliseconds for internal use.
timeStringAsSeconds(String) - Static method in class org.apache.spark.util.Utils: Convert a time parameter such as (50s, 100ms, or 250us) to seconds for internal use.
timeTakenMs(Function0<T>) - Static method in class org.apache.spark.util.Utils: Records the duration of running `body`.
timeToString(long, TimeUnit) - Static method in class org.apache.spark.internal.config.ConfigHelpers
TimeTrackingOutputStream - Class in org.apache.spark.storage: Intercepts write calls and tracks total time spent writing in order to update shuffle write metrics.
TimeTrackingOutputStream(ShuffleWriteMetrics, OutputStream) - Constructor for class org.apache.spark.storage.TimeTrackingOutputStream
timeUnit() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
TIMING_DATA() - Static method in class org.apache.spark.api.r.SpecialLengths
to(Time, Duration) - Method in class org.apache.spark.streaming.Time
to_date(Column) - Static method in class org.apache.spark.sql.functions: Converts the column into DateType by casting rules to DateType.
to_date(Column, String) - Static method in class org.apache.spark.sql.functions: Converts the column into a DateType with a specified format
to_json(Column, Map<String, String>) - Static method in class org.apache.spark.sql.functions: (Scala-specific) Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema.
to_json(Column, Map<String, String>) - Static method in class org.apache.spark.sql.functions: (Java-specific) Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema.
to_json(Column) - Static method in class org.apache.spark.sql.functions: Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema.
to_timestamp(Column) - Static method in class org.apache.spark.sql.functions: Converts to a timestamp by casting rules to TimestampType.
to_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions: Converts time string with the given pattern to timestamp.
to_utc_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions: Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC.
to_utc_timestamp(Column, Column) - Static method in class org.apache.spark.sql.functions: Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC.
toApacheCommonsStats(StatCounter) - Method in interface org.apache.spark.mllib.stat.test.StreamingTestMethod: Implicit adapter to convert between streaming summary statistics type and the type required by the t-testing libraries.
toApi() - Method in class org.apache.spark.status.LiveRDDDistribution
toApi() - Method in class org.apache.spark.status.LiveStage
toArray() - Method in class org.apache.spark.input.PortableDataStream: Read the file as a byte array
toArray() - Method in class org.apache.spark.ml.linalg.DenseVector
toArray() - Method in interface org.apache.spark.ml.linalg.Matrix: Converts to a dense array in column major.
toArray() - Method in class org.apache.spark.ml.linalg.SparseVector
toArray() - Method in interface org.apache.spark.ml.linalg.Vector: Converts the instance to a double array.
toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix: Converts to a dense array in column major.
toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
toArray() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts the instance to a double array.
toBigDecimal() - Method in class org.apache.spark.sql.types.Decimal
toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Converts to BlockMatrix.
toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Converts to BlockMatrix.
toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Converts to BlockMatrix.
toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Converts to BlockMatrix.
toBoolean(String, String) - Static method in class org.apache.spark.internal.config.ConfigHelpers
toBooleanArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Collects data and assembles a local dense breeze matrix (for test only).
toByte() - Method in class org.apache.spark.sql.types.Decimal
toByteArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
toByteArray() - Method in class org.apache.spark.util.sketch.CountMinSketch: Serializes this CountMinSketch and returns the serialized form.
toByteBuffer() - Method in interface org.apache.spark.storage.BlockData
toByteBuffer() - Method in class org.apache.spark.storage.DiskBlockData
toCatalystDecimal(HiveDecimalObjectInspector, Object) - Static method in class org.apache.spark.sql.hive.HiveShim
toChunkedByteBuffer(Function1<Object, ByteBuffer>) - Method in interface org.apache.spark.storage.BlockData
toChunkedByteBuffer(Function1<Object, ByteBuffer>) - Method in class org.apache.spark.storage.DiskBlockData
toColumn() - Method in class org.apache.spark.sql.expressions.Aggregator: Returns this Aggregator as a TypedColumn that can be used in Dataset.
toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Converts to CoordinateMatrix.
toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Converts this matrix to a CoordinateMatrix.
toCryptoConf(SparkConf) - Static method in class org.apache.spark.security.CryptoStreamUtils
toDDL() - Method in class org.apache.spark.sql.types.StructField: Returns a string containing a schema in DDL format.
toDDL() - Method in class org.apache.spark.sql.types.StructType: Returns a string containing a schema in DDL format.
toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike: A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel: Full description of model
toDebugString() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel: Full description of model
toDebugString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Print the full model to a string.
toDebugString() - Method in class org.apache.spark.rdd.RDD: A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.SparkConf: Return a string listing all keys and values, one per line.
toDebugString() - Method in class org.apache.spark.sql.types.Decimal
toDegrees(Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
Use degrees. Since 2.1.0.
toDegrees(String) - Static method in class org.apache.spark.sql.functions: Deprecated.
Use degrees. Since 2.1.0.
toDense() - Method in interface org.apache.spark.ml.linalg.Matrix: Converts this matrix to a dense matrix while maintaining the layout of the current matrix.
toDense() - Method in interface org.apache.spark.ml.linalg.Vector: Converts this vector to a dense vector.
toDense() - Method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate a DenseMatrix from the given SparseMatrix.
toDense() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts this vector to a dense vector.
toDenseColMajor() - Method in interface org.apache.spark.ml.linalg.Matrix: Converts this matrix to a dense matrix in column major order.
toDenseMatrix(boolean) - Method in interface org.apache.spark.ml.linalg.Matrix: Converts this matrix to a dense matrix.
toDenseRowMajor() - Method in interface org.apache.spark.ml.linalg.Matrix: Converts this matrix to a dense matrix in row major order.
toDF(String...) - Method in class org.apache.spark.sql.Dataset: Converts this strongly typed collection of data to a generic DataFrame with columns renamed.
toDF() - Method in class org.apache.spark.sql.Dataset: Converts this strongly typed collection of data to a generic DataFrame.
toDF(Seq<String>) - Method in class org.apache.spark.sql.Dataset: Converts this strongly typed collection of data to a generic DataFrame with columns renamed.
toDF() - Method in class org.apache.spark.sql.DatasetHolder
toDF(Seq<String>) - Method in class org.apache.spark.sql.DatasetHolder
toDouble(Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
toDouble() - Method in class org.apache.spark.sql.types.Decimal
toDoubleArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
toDS() - Method in class org.apache.spark.sql.DatasetHolder
toEdgeTriplet() - Method in class org.apache.spark.graphx.EdgeContext: Converts the edge and vertex properties into an EdgeTriplet for convenience.
toErrorString() - Method in class org.apache.spark.ExceptionFailure
toErrorString() - Method in class org.apache.spark.ExecutorLostFailure
toErrorString() - Method in class org.apache.spark.FetchFailed
toErrorString() - Static method in class org.apache.spark.Resubmitted
toErrorString() - Method in class org.apache.spark.TaskCommitDenied
toErrorString() - Method in interface org.apache.spark.TaskFailedReason: Error message displayed in the web UI.
toErrorString() - Method in class org.apache.spark.TaskKilled
toErrorString() - Static method in class org.apache.spark.TaskResultLost
toErrorString() - Static method in class org.apache.spark.UnknownReason
toFloat(Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
toFloat() - Method in class org.apache.spark.sql.types.Decimal
toFloatArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
toFormattedString() - Method in class org.apache.spark.streaming.Duration
toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Converts to IndexedRowMatrix.
toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Converts to IndexedRowMatrix.
toInputStream() - Method in interface org.apache.spark.storage.BlockData
toInputStream() - Method in class org.apache.spark.storage.DiskBlockData
toInspector(DataType) - Method in interface org.apache.spark.sql.hive.HiveInspectors
toInspector(Expression) - Method in interface org.apache.spark.sql.hive.HiveInspectors: Maps the Catalyst expression to an ObjectInspector. If the expression is a Literal or foldable, a constant writable object inspector is returned; otherwise, the object inspector is chosen according to the expression's Catalyst data type.
toInspector(DataType) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
toInspector(Expression) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
toInt(Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
toInt() - Method in class org.apache.spark.sql.types.Decimal
toInt() - Method in class org.apache.spark.storage.StorageLevel
toIntArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
toJavaBigDecimal() - Method in class org.apache.spark.sql.types.Decimal
toJavaBigInteger() - Method in class org.apache.spark.sql.types.Decimal
toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Convert to a JavaDStream
toJavaRDD() - Method in class org.apache.spark.rdd.RDD
toJavaRDD() - Method in class org.apache.spark.sql.Dataset: Returns the content of the Dataset as a JavaRDD of Ts.
toJson(Matrix) - Static method in class org.apache.spark.ml.linalg.JsonMatrixConverter: Converts the Matrix to a JSON string.
toJson(Vector) - Static method in class org.apache.spark.ml.linalg.JsonVectorConverter: Converts the vector to a JSON string.
toJson() - Method in class org.apache.spark.mllib.linalg.DenseVector
toJson() - Method in class org.apache.spark.mllib.linalg.SparseVector
toJson() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts the vector to a JSON string.
toJSON() - Method in class org.apache.spark.sql.Dataset: Returns the content of the Dataset as a Dataset of JSON strings.
Tokenizer - Class in org.apache.spark.ml.feature: A tokenizer that converts the input string to lowercase and then splits it by white spaces.
Tokenizer(String) - Constructor for class org.apache.spark.ml.feature.Tokenizer
Tokenizer() - Constructor for class org.apache.spark.ml.feature.Tokenizer
tokens() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens
tol() - Method in interface org.apache.spark.ml.param.shared.HasTol: Param for the convergence tolerance for iterative algorithms (>= 0).
toLocal() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel: Convert this distributed model to a local representation.
toLocal() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Convert model to a local model.
toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an iterator that contains all of the elements in this RDD.
toLocalIterator() - Method in class org.apache.spark.rdd.RDD: Return an iterator that contains all of the elements in this RDD.
toLocalIterator() - Method in class org.apache.spark.sql.Dataset: Returns an iterator that contains all rows in this Dataset.
toLocalMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Collect the distributed matrix on the driver as a DenseMatrix.
toLong(Decimal) - Method in interface org.apache.spark.sql.types.Decimal.DecimalIsConflicted
toLong() - Method in class org.apache.spark.sql.types.Decimal
toLongArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
toLowercase() - Method in class org.apache.spark.ml.feature.RegexTokenizer: Indicates whether to convert all characters to lowercase before tokenizing.
toMetadata(Metadata) - Method in class org.apache.spark.ml.attribute.Attribute: Converts to ML metadata with some existing metadata.
toMetadata() - Method in class org.apache.spark.ml.attribute.Attribute: Converts to ML metadata
toMetadata(Metadata) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Converts to ML metadata with some existing metadata.
toMetadata() - Method in class org.apache.spark.ml.attribute.AttributeGroup: Converts to ML metadata
toMetadata(Metadata) - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
toMetadata() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
toNetty() - Method in interface org.apache.spark.storage.BlockData: Returns a Netty-friendly wrapper for the block's data.
toNetty() - Method in class org.apache.spark.storage.DiskBlockData: Returns a Netty-friendly wrapper for the block's data.
toNumber(String, Function1<String, T>, String, String) - Static method in class org.apache.spark.internal.config.ConfigHelpers
toOld() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel: Convert to spark.mllib DecisionTreeModel (losing some information)
toOld() - Method in interface org.apache.spark.ml.tree.Split: Convert to old Split format
tooltip(String, String) - Static method in class org.apache.spark.ui.UIUtils
ToolTips - Class in org.apache.spark.ui
ToolTips() - Constructor for class org.apache.spark.ui.ToolTips
toOps(T, ClassTag<VD>) - Method in interface org.apache.spark.graphx.impl.VertexPartitionBaseOpsConstructor
top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the top k (largest) elements from this RDD as defined by the specified Comparator[T] and maintains the order.
top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the top k (largest) elements from this RDD using the natural ordering for T and maintains the order.
top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the top k (largest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.dstream.DStream
topByKey(int, Ordering<V>) - Method in class org.apache.spark.mllib.rdd.MLPairRDDFunctions: Returns the top k (largest) elements for each key from this RDD as defined by the specified implicit Ordering[T].
topDocumentsPerTopic(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Return the top documents for each topic
topicAssignments() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Return the top topic for each (doc, term) pair.
topicConcentration() - Method in interface org.apache.spark.ml.clustering.LDAParams: Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics' distributions over terms.
topicConcentration() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
topicConcentration() - Method in class org.apache.spark.mllib.clustering.LDAModel: Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics' distributions over terms.
topicConcentration() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
topicDistribution(Vector) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Predicts the topic mixture distribution for a document (often called "theta" in the literature).
topicDistributionCol() - Method in interface org.apache.spark.ml.clustering.LDAParams: Output column with estimates of the topic mixture distribution for each document (often called "theta" in the literature).
topicDistributions() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: For each document in the training set, return the distribution over topics for that document ("theta_doc").
topicDistributions(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Predicts the topic mixture distribution for each document (often called "theta" in the literature).
topicDistributions(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Java-friendly version of topicDistributions
topics() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
topicsMatrix() - Method in class org.apache.spark.ml.clustering.LDAModel: Inferred topics, where each topic is represented by a distribution over terms.
topicsMatrix() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Inferred topics, where each topic is represented by a distribution over terms.
topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LDAModel: Inferred topics, where each topic is represented by a distribution over terms.
topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
topK(Iterator<Tuple2<String, Object>>, int) - Static method in class org.apache.spark.streaming.util.RawTextHelper: Gets the top k words in terms of word counts.
toPMML(StreamResult) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable: Export the model to the stream result in PMML format
toPMML(String) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable: Export the model to a local file in PMML format
toPMML(SparkContext, String) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable: Export the model to a directory on a distributed file system in PMML format
toPMML(OutputStream) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable: Export the model to the OutputStream in PMML format
toPMML() - Method in interface org.apache.spark.mllib.pmml.PMMLExportable: Export the model to a String in PMML format
topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
Topology - Interface in org.apache.spark.ml.ann: Trait for the artificial neural network (ANN) topology properties
topologyFile() - Method in class org.apache.spark.storage.FileBasedTopologyMapper
topologyInfo() - Method in class org.apache.spark.storage.BlockManagerId
topologyMap() - Method in class org.apache.spark.storage.FileBasedTopologyMapper
TopologyMapper - Class in org.apache.spark.storage: :: DeveloperApi :: TopologyMapper provides topology information for a given host. param: conf SparkConf to get required properties, if needed
TopologyMapper(SparkConf) - Constructor for class org.apache.spark.storage.TopologyMapper
TopologyModel - Interface in org.apache.spark.ml.ann: Trait for ANN topology model
toPredict() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.PredictData
topTopicsPerDocument(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: For each document, return the top k weighted topics for that document and their weights.
toRadians(Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
Use radians. Since 2.1.0.
toRadians(String) - Static method in class org.apache.spark.sql.functions: Deprecated.
Use radians. Since 2.1.0.
toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Converts to RowMatrix, dropping row indices after grouping by row index.
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix: Drops row indices and converts this matrix to a RowMatrix.
toScalaBigInt() - Method in class org.apache.spark.sql.types.Decimal
toSeq() - Method in class org.apache.spark.ml.param.ParamMap: Converts this param map to a sequence of param pairs.
toSeq() - Method in interface org.apache.spark.sql.Row: Return a Scala Seq representing the row.
toShort() - Method in class org.apache.spark.sql.types.Decimal
toShortArray() - Method in class org.apache.spark.sql.vectorized.ColumnarArray
toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
toSparse() - Method in interface org.apache.spark.ml.linalg.Matrix: Converts this matrix to a sparse matrix while maintaining the layout of the current matrix.
toSparse() - Method in interface org.apache.spark.ml.linalg.Vector: Converts this vector to a sparse vector with all explicit zeros removed.
toSparse() - Method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a SparseMatrix from the given DenseMatrix.
toSparse() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts this vector to a sparse vector with all explicit zeros removed.
toSparseColMajor() - Method in interface org.apache.spark.ml.linalg.Matrix: Converts this matrix to a sparse matrix in column major order.
toSparseMatrix(boolean) - Method in interface org.apache.spark.ml.linalg.Matrix: Converts this matrix to a sparse matrix.
toSparseRowMajor() - Method in interface org.apache.spark.ml.linalg.Matrix: Converts this matrix to a sparse matrix in row major order.
toSparseWithSize(int) - Method in interface org.apache.spark.ml.linalg.Vector: Converts this vector to a sparse vector with all explicit zeros removed when the size is known.
toSparseWithSize(int) - Method in interface org.apache.spark.mllib.linalg.Vector: Converts this vector to a sparse vector with all explicit zeros removed when the size is known.
toSplit() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
toString() - Method in class org.apache.spark.Accumulable: Deprecated.
toString() - Method in class org.apache.spark.api.java.JavaRDD
toString() - Method in class org.apache.spark.api.java.Optional
toString() - Method in class org.apache.spark.broadcast.Broadcast
toString() - Static method in class org.apache.spark.CleanAccum
toString() - Static method in class org.apache.spark.CleanBroadcast
toString() - Static method in class org.apache.spark.CleanCheckpoint
toString() - Static method in class org.apache.spark.CleanRDD
toString() - Static method in class org.apache.spark.CleanShuffle
toString() - Method in class org.apache.spark.ContextBarrierId
toString() - Static method in class org.apache.spark.ExceptionFailure
toString() - Static method in class org.apache.spark.ExecutorLostFailure
toString() - Static method in class org.apache.spark.ExecutorRegistered
toString() - Static method in class org.apache.spark.ExecutorRemoved
toString() - Static method in class org.apache.spark.FetchFailed
toString() - Method in class org.apache.spark.graphx.EdgeDirection
toString() - Method in class org.apache.spark.graphx.EdgeTriplet
toString() - Method in class org.apache.spark.ml.attribute.Attribute
toString() - Method in class org.apache.spark.ml.attribute.AttributeGroup
toString() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
toString() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
toString() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
toString() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
toString() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
toString() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
toString() - Static method in class org.apache.spark.ml.clustering.ClusterData
toString() - Method in class org.apache.spark.ml.feature.LabeledPoint
toString() - Method in class org.apache.spark.ml.feature.RFormula
toString() - Method in class org.apache.spark.ml.feature.RFormulaModel
toString() - Method in class org.apache.spark.ml.linalg.DenseVector
toString() - Method in interface org.apache.spark.ml.linalg.Matrix: A human readable representation of the matrix
toString(int, int) - Method in interface org.apache.spark.ml.linalg.Matrix: A human readable representation of the matrix with maximum lines and width
toString() - Method in class org.apache.spark.ml.linalg.SparseVector
toString() - Method in class org.apache.spark.ml.param.Param
toString() - Method in class org.apache.spark.ml.param.ParamMap
toString() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
toString() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
toString() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary
toString() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
toString() - Method in interface org.apache.spark.ml.tree.DecisionTreeModel: Summary of the model
toString() - Method in class org.apache.spark.ml.tree.InternalNode
toString() - Method in class org.apache.spark.ml.tree.LeafNode
toString() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel: Summary of the model
toString() - Method in interface org.apache.spark.ml.util.Identifiable
toString() - Static method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
toString() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
toString() - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV1_0$.Data
toString() - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel.SaveLoadV2_0$.Data
toString() - Method in class org.apache.spark.mllib.classification.SVMModel
toString() - Static method in class org.apache.spark.mllib.feature.ChiSqSelectorModel.SaveLoadV1_0$.Data
toString() - Static method in class org.apache.spark.mllib.feature.VocabWord
toString() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
toString() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
toString() - Static method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
toString() - Static method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
toString() - Method in interface org.apache.spark.mllib.linalg.Matrix: A human readable representation of the matrix
toString(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix: A human readable representation of the matrix with maximum lines and width
toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
toString() - Static method in class org.apache.spark.mllib.recommendation.Rating
toString() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Print a summary of the model.
toString() - Static method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
toString() - Method in class org.apache.spark.mllib.stat.test.BinarySample
toString() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
toString() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
toString() - Method in interface org.apache.spark.mllib.stat.test.TestResult: String explaining the hypothesis test result.
toString() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
toString() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
toString() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
toString() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
toString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Print a summary of the model.
toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
toString() - Method in class org.apache.spark.mllib.tree.model.Node
toString() - Method in class org.apache.spark.mllib.tree.model.Predict
toString() - Method in class org.apache.spark.mllib.tree.model.Split
toString() - Method in class org.apache.spark.partial.BoundedDouble
toString() - Method in class org.apache.spark.partial.PartialResult
toString() - Static method in class org.apache.spark.rdd.CheckpointState
toString() - Static method in class org.apache.spark.rdd.DeterministicLevel
toString() - Method in class org.apache.spark.rdd.RDD
toString() - Static method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
toString() - Static method in class org.apache.spark.scheduler.BlacklistedExecutor
toString() - Static method in class org.apache.spark.scheduler.ExecutorKilled
toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
toString() - Static method in class org.apache.spark.scheduler.local.KillTask
toString() - Static method in class org.apache.spark.scheduler.local.ReviveOffers
toString() - Static method in class org.apache.spark.scheduler.local.StatusUpdate
toString() - Static method in class org.apache.spark.scheduler.local.StopExecutor
toString() - Static method in class org.apache.spark.scheduler.LossReasonPending
toString() - Static method in class org.apache.spark.scheduler.SchedulingMode
toString() - Static method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
toString() - Static method in class org.apache.spark.scheduler.SparkListenerApplicationStart
toString() - Static method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
toString() - Static method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
toString() - Static method in class org.apache.spark.scheduler.SparkListenerBlockUpdated
toString() - Static method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklistedForStage
toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
toString() - Static method in class org.apache.spark.scheduler.SparkListenerExecutorUnblacklisted
toString() - Static method in class org.apache.spark.scheduler.SparkListenerJobEnd
toString() - Static method in class org.apache.spark.scheduler.SparkListenerJobStart
toString() - Static method in class org.apache.spark.scheduler.SparkListenerLogStart
toString() - Static method in class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
toString() - Static method in class org.apache.spark.scheduler.SparkListenerNodeBlacklistedForStage
toString() - Static method in class org.apache.spark.scheduler.SparkListenerNodeUnblacklisted
toString() - Static method in class org.apache.spark.scheduler.SparkListenerSpeculativeTaskSubmitted
toString() - Static method in class org.apache.spark.scheduler.SparkListenerStageCompleted
toString() - Static method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
toString() - Static method in class org.apache.spark.scheduler.SparkListenerTaskEnd
toString() - Static method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
toString() - Static method in class org.apache.spark.scheduler.SparkListenerTaskStart
toString() - Static method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
toString() - Method in class org.apache.spark.scheduler.SplitInfo
toString() - Static method in class org.apache.spark.scheduler.TaskLocality
toString() - Method in class org.apache.spark.SerializableWritable
toString() - Method in class org.apache.spark.sql.catalog.Column
toString() - Method in class org.apache.spark.sql.catalog.Database
toString() - Method in class org.apache.spark.sql.catalog.Function
toString() - Method in class org.apache.spark.sql.catalog.Table
toString() - Method in class org.apache.spark.sql.Column
toString() - Method in class org.apache.spark.sql.Dataset
toString() - Static method in class org.apache.spark.sql.expressions.UserDefinedFunction
toString() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
toString() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveDirCommand
toString() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
toString() - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
toString() - Static method in class org.apache.spark.sql.hive.HiveUDAFBuffer
toString() - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
toString() - Static method in class org.apache.spark.sql.hive.RelationConversions
toString() - Static method in class org.apache.spark.sql.jdbc.JdbcType
toString() - Method in class org.apache.spark.sql.KeyValueGroupedDataset
toString() - Method in interface org.apache.spark.sql.RelationalGroupedDataset.GroupType
toString() - Method in class org.apache.spark.sql.RelationalGroupedDataset
toString() - Method in interface org.apache.spark.sql.Row
toString() - Static method in class org.apache.spark.sql.sources.And
toString() - Static method in class org.apache.spark.sql.sources.EqualNullSafe
toString() - Static method in class org.apache.spark.sql.sources.EqualTo
toString() - Static method in class org.apache.spark.sql.sources.GreaterThan
toString() - Static method in class org.apache.spark.sql.sources.GreaterThanOrEqual
toString() - Method in class org.apache.spark.sql.sources.In
toString() - Static method in class org.apache.spark.sql.sources.IsNotNull
toString() - Static method in class org.apache.spark.sql.sources.IsNull
toString() - Static method in class org.apache.spark.sql.sources.LessThan
toString() - Static method in class org.apache.spark.sql.sources.LessThanOrEqual
toString() - Static method in class org.apache.spark.sql.sources.Not
toString() - Static method in class org.apache.spark.sql.sources.Or
toString() - Static method in class org.apache.spark.sql.sources.StringContains
toString() - Static method in class org.apache.spark.sql.sources.StringEndsWith
toString() - Static method in class org.apache.spark.sql.sources.StringStartsWith
toString() - Method in class org.apache.spark.sql.sources.v2.reader.streaming.Offset
toString() - Method in class org.apache.spark.sql.streaming.SinkProgress
toString() - Method in class org.apache.spark.sql.streaming.SourceProgress
toString() - Method in class org.apache.spark.sql.streaming.StateOperatorProgress
toString() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
toString() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
toString() - Method in class org.apache.spark.sql.streaming.StreamingQueryStatus
toString() - Static method in class org.apache.spark.sql.types.CharType
toString() - Method in class org.apache.spark.sql.types.Decimal
toString() - Method in class org.apache.spark.sql.types.DecimalType
toString() - Method in class org.apache.spark.sql.types.Metadata
toString() - Method in class org.apache.spark.sql.types.StructField
toString() - Static method in class org.apache.spark.sql.types.VarcharType
toString() - Static method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
toString() - Static method in class org.apache.spark.status.api.v1.ApplicationInfo
toString() - Method in class org.apache.spark.status.api.v1.StackTrace
toString() - Static method in class org.apache.spark.status.api.v1.ThreadStackTrace
toString() - Method in class org.apache.spark.storage.BlockId
toString() - Method in class org.apache.spark.storage.BlockManagerId
toString() - Static method in class org.apache.spark.storage.BroadcastBlockId
toString() - Static method in class org.apache.spark.storage.RDDBlockId
toString() - Method in class org.apache.spark.storage.RDDInfo
toString() - Static method in class org.apache.spark.storage.ShuffleBlockId
toString() - Static method in class org.apache.spark.storage.ShuffleDataBlockId
toString() - Static method in class org.apache.spark.storage.ShuffleIndexBlockId
toString() - Method in class org.apache.spark.storage.StorageLevel
toString() - Static method in class org.apache.spark.storage.StreamBlockId
toString() - Static method in class org.apache.spark.storage.TaskResultBlockId
toString() - Method in class org.apache.spark.streaming.Duration
toString() - Static method in class org.apache.spark.streaming.scheduler.BatchInfo
toString() - Static method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
toString() - Static method in class org.apache.spark.streaming.scheduler.ReceiverInfo
toString() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
toString() - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerStreamingStarted
toString() - Method in class org.apache.spark.streaming.State
toString() - Method in class org.apache.spark.streaming.Time
toString() - Static method in class org.apache.spark.TaskCommitDenied
toString() - Static method in class org.apache.spark.TaskKilled
toString() - Static method in class org.apache.spark.TaskState
toString() - Method in class org.apache.spark.util.AccumulatorV2
toString() - Method in class org.apache.spark.util.MutablePair
toString() - Method in class org.apache.spark.util.StatCounter
toStructField(Metadata) - Method in class org.apache.spark.ml.attribute.Attribute: Converts to a StructField with some existing metadata.
toStructField() - Method in class org.apache.spark.ml.attribute.Attribute: Converts to a StructField.
toStructField(Metadata) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Converts to a StructField with some existing metadata.
toStructField() - Method in class org.apache.spark.ml.attribute.AttributeGroup: Converts to a StructField.
toStructField(Metadata) - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
toStructField() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
totalBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
totalBytesRead(ShuffleReadMetrics) - Static method in class org.apache.spark.ui.jobs.ApiHelper
totalCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
totalCores() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalCores() - Method in class org.apache.spark.status.LiveExecutor
totalCount() - Method in class org.apache.spark.util.sketch.CountMinSketch: Total count of items added to this CountMinSketch so far.
totalDelay() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for all the jobs of this batch to finish processing from the time they were submitted.
totalDiskSize() - Method in class org.apache.spark.ui.storage.ExecutorStreamSummary
totalDuration() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalDuration() - Method in class org.apache.spark.status.LiveExecutor
totalGCTime() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalGcTime() - Method in class org.apache.spark.status.LiveExecutor
totalInputBytes() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalInputBytes() - Method in class org.apache.spark.status.LiveExecutor
totalIterations() - Method in interface org.apache.spark.ml.classification.LogisticRegressionTrainingSummary: Number of training iterations.
totalIterations() - Method in class org.apache.spark.ml.regression.LinearRegressionTrainingSummary: Number of training iterations until termination.
totalMemSize() - Method in class org.apache.spark.ui.storage.ExecutorStreamSummary
totalNumNodes() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel: Total number of nodes, summed over all trees in the ensemble.
totalOffHeap() - Method in class org.apache.spark.status.LiveExecutor
totalOffHeapStorageMemory() - Method in interface org.apache.spark.SparkExecutorInfo
totalOffHeapStorageMemory() - Method in class org.apache.spark.SparkExecutorInfoImpl
totalOffHeapStorageMemory() - Method in class org.apache.spark.status.api.v1.MemoryMetrics
totalOnHeap() - Method in class org.apache.spark.status.LiveExecutor
totalOnHeapStorageMemory() - Method in interface org.apache.spark.SparkExecutorInfo
totalOnHeapStorageMemory() - Method in class org.apache.spark.SparkExecutorInfoImpl
totalOnHeapStorageMemory() - Method in class org.apache.spark.status.api.v1.MemoryMetrics
totalShuffleRead() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalShuffleRead() - Method in class org.apache.spark.status.LiveExecutor
totalShuffleWrite() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalShuffleWrite() - Method in class org.apache.spark.status.LiveExecutor
totalTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalTasks() - Method in class org.apache.spark.status.LiveExecutor
toTuple() - Method in class org.apache.spark.graphx.EdgeTriplet
toTypeInfo() - Method in class org.apache.spark.sql.hive.HiveInspectors.typeInfoConversions
toUnscaledLong() - Method in class org.apache.spark.sql.types.Decimal
toVirtualHosts(Seq<String>) - Static method in class org.apache.spark.ui.JettyUtils
train(RDD<ALS.Rating<ID>>, int, int, int, int, double, boolean, double, boolean, StorageLevel, StorageLevel, int, long, ClassTag<ID>, Ordering<ID>) - Static method in class org.apache.spark.ml.recommendation.ALS: :: DeveloperApi :: Implementation of the ALS algorithm.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes: Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes: Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, double, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayes: Trains a Naive Bayes model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<Vector>, int, int, String, long) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using the given set of parameters.
train(RDD<Vector>, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using the given set of parameters.
train(RDD<Vector>, int, int, int, String, long) - Static method in class org.apache.spark.mllib.clustering.KMeans: Deprecated.
Use train method without 'runs'. Since 2.1.0.
train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans: Deprecated.
Use train method without 'runs'. Since 2.1.0.
train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using specified parameters and the default values for unspecified.
train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans: Deprecated.
Use train method without 'runs'. Since 2.1.0.
train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings by users for a subset of products.
train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings by users for a subset of products.
train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings by users for a subset of products.
train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of ratings by users for a subset of products.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, Strategy) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model.
train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model.
train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model.
train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model.
train(RDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Method to train a gradient boosting model.
train(JavaRDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.train
trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model for binary or multiclass classification.
trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Java-friendly API for org.apache.spark.mllib.tree.DecisionTree.trainClassifier
trainClassifier(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Method to train a decision tree model for binary or multiclass classification.
trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Method to train a decision tree model for binary or multiclass classification.
trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Java-friendly API for org.apache.spark.mllib.tree.RandomForest.trainClassifier
trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' given by users to some products, in the form of (userID, productID, preference) pairs.
trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' of users for a subset of products.
trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' of users for a subset of products.
trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS: Train a matrix factorization model given an RDD of 'implicit preferences' of users for a subset of products.
trainingCost() - Method in class org.apache.spark.ml.clustering.KMeansSummary
trainingCost() - Method in class org.apache.spark.mllib.clustering.KMeansModel
trainingLogLikelihood() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel: Log likelihood of the observed tokens in the training set, given the current parameter estimates: log P(docs | topics, topic distributions for docs, Dirichlet hyperparameters)
trainOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Update the clustering model by training on batches of data from a DStream.
trainOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Java-friendly version of trainOn.
trainOn(DStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Update the model by training on batches of data from a DStream.
trainOn(JavaDStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Java-friendly version of trainOn.
trainRatio() - Method in interface org.apache.spark.ml.tuning.TrainValidationSplitParams: Param for ratio between train and validation data.
trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model for regression.
trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Java-friendly API for org.apache.spark.mllib.tree.DecisionTree.trainRegressor
trainRegressor(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Method to train a decision tree model for regression.
trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Method to train a decision tree model for regression.
trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Java-friendly API for org.apache.spark.mllib.tree.RandomForest.trainRegressor
TrainValidationSplit - Class in org.apache.spark.ml.tuning: Validation for hyper-parameter tuning.
TrainValidationSplit(String) - Constructor for class org.apache.spark.ml.tuning.TrainValidationSplit
TrainValidationSplit() - Constructor for class org.apache.spark.ml.tuning.TrainValidationSplit
TrainValidationSplitModel - Class in org.apache.spark.ml.tuning: Model from train validation split.
TrainValidationSplitModel.TrainValidationSplitModelWriter - Class in org.apache.spark.ml.tuning: Writer for TrainValidationSplitModel.
TrainValidationSplitParams - Interface in org.apache.spark.ml.tuning: Params for TrainValidationSplit and TrainValidationSplitModel.
transferred() - Method in class org.apache.spark.storage.ReadableChannelFileRegion
transferTo(WritableByteChannel, long) - Method in class org.apache.spark.storage.ReadableChannelFileRegion
transform(Function1<Try<T>, Try<S>>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
transform(Function1<Try<T>, Try<S>>, ExecutionContext) - Method in interface org.apache.spark.FutureAction
transform(Dataset<?>) - Method in class org.apache.spark.ml.classification.ClassificationModel: Transforms dataset by reading from featuresCol, and appending new columns as specified by parameters: predicted labels as predictionCol of type Double; raw predictions (confidences) as rawPredictionCol of type Vector.
transform(Dataset<?>) - Method in class org.apache.spark.ml.classification.OneVsRestModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel: Transforms dataset by reading from featuresCol, and appending new columns as specified by parameters: predicted labels as predictionCol of type Double; raw predictions (confidences) as rawPredictionCol of type Vector; probability of each class as probabilityCol of type Vector.
transform(Dataset<?>) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.clustering.KMeansModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.clustering.LDAModel: Transforms the input dataset.
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.Binarizer
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.Bucketizer
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.ColumnPruner
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.FeatureHasher
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.HashingTF
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.IDFModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.ImputerModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.IndexToString
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.Interaction
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.PCAModel: Transform a vector by computed Principal Components.
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.RFormulaModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.SQLTransformer
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.StandardScalerModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.StopWordsRemover
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.StringIndexerModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorAssembler
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorSizeHint
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.VectorSlicer
transform(Dataset<?>) - Method in class org.apache.spark.ml.feature.Word2VecModel: Transform a sentence column to a vector column to represent the whole sentence.
transform(Dataset<?>) - Method in class org.apache.spark.ml.fpm.FPGrowthModel: The transform method first generates the association rules according to the frequent itemsets.
transform(Dataset<?>) - Method in class org.apache.spark.ml.PipelineModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.PredictionModel: Transforms dataset by reading from featuresCol, calling predict, and storing the predictions as a new column predictionCol.
transform(Dataset<?>) - Method in class org.apache.spark.ml.recommendation.ALSModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
transform(Dataset<?>, ParamPair<?>, ParamPair<?>...) - Method in class org.apache.spark.ml.Transformer: Transforms the dataset with optional parameters.
transform(Dataset<?>, ParamPair<?>, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Transformer: Transforms the dataset with optional parameters.
transform(Dataset<?>, ParamMap) - Method in class org.apache.spark.ml.Transformer: Transforms the dataset with provided parameter map as additional parameters.
transform(Dataset<?>) - Method in class org.apache.spark.ml.Transformer: Transforms the input dataset.
transform(Dataset<?>) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
transform(Dataset<?>) - Method in class org.apache.spark.ml.UnaryTransformer
transform(Vector) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel: Applies transformation on a vector.
transform(Vector) - Method in class org.apache.spark.mllib.feature.ElementwiseProduct: Does the Hadamard product transformation.
transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document into a sparse term frequency vector.
transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document into a sparse term frequency vector (Java version).
transform(RDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document to term frequency vectors.
transform(JavaRDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document to term frequency vectors (Java version).
transform(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel: Transforms term frequency (TF) vectors to TF-IDF vectors.
transform(Vector) - Method in class org.apache.spark.mllib.feature.IDFModel: Transforms a term frequency (TF) vector to a TF-IDF vector.
transform(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel: Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
transform(Vector) - Method in class org.apache.spark.mllib.feature.Normalizer: Applies unit length normalization on a vector.
transform(Vector) - Method in class org.apache.spark.mllib.feature.PCAModel: Transform a vector by computed Principal Components.
transform(Vector) - Method in class org.apache.spark.mllib.feature.StandardScalerModel: Applies standardization transformation on a vector.
transform(Vector) - Method in interface org.apache.spark.mllib.feature.VectorTransformer: Applies transformation on a vector.
transform(RDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer: Applies transformation on an RDD[Vector].
transform(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer: Applies transformation on a JavaRDD[Vector].
transform(String) - Method in class org.apache.spark.mllib.feature.Word2VecModel: Transforms a word to its vector representation.
transform(Function1<Try<T>, Try<S>>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
transform(Function1<Dataset<T>, Dataset>) - Method in class org.apache.spark.sql.Dataset: Concise syntax for chaining custom transformations.
transform(Function<R, JavaRDD>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<R, Time, JavaRDD>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transform(Function1<RDD<T>, RDD>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<RDD<T>, Time, RDD>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
Transformer - Class in org.apache.spark.ml: :: DeveloperApi :: Abstract class for transformers that transform one dataset into another.
Transformer() - Constructor for class org.apache.spark.ml.Transformer
transformOutputColumnSchema(StructField, String, boolean, boolean) - Static method in class org.apache.spark.ml.feature.OneHotEncoderCommon: Prepares the StructField with proper metadata for OneHotEncoder's output column.
transformSchema(StructType) - Method in class org.apache.spark.ml.classification.OneVsRest
transformSchema(StructType) - Method in class org.apache.spark.ml.classification.OneVsRestModel
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.GaussianMixture
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.KMeans
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.KMeansModel
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.LDA
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.LDAModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Binarizer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Bucketizer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ChiSqSelector
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ColumnPruner
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.CountVectorizer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.FeatureHasher
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.HashingTF
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IDF
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IDFModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Imputer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ImputerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IndexToString
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Interaction
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MinHashLSH
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MinMaxScaler
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.PCA
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.PCAModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.RFormula
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.RFormulaModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.SQLTransformer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StandardScaler
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StandardScalerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StopWordsRemover
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StringIndexer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StringIndexerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorAssembler
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorIndexer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorSizeHint
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorSlicer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Word2Vec
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Word2VecModel
transformSchema(StructType) - Method in class org.apache.spark.ml.fpm.FPGrowth
transformSchema(StructType) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
transformSchema(StructType) - Method in class org.apache.spark.ml.Pipeline
transformSchema(StructType) - Method in class org.apache.spark.ml.PipelineModel
transformSchema(StructType) - Method in class org.apache.spark.ml.PipelineStage: :: DeveloperApi ::
transformSchema(StructType) - Method in class org.apache.spark.ml.PredictionModel
transformSchema(StructType) - Method in class org.apache.spark.ml.Predictor
transformSchema(StructType) - Method in class org.apache.spark.ml.recommendation.ALS
transformSchema(StructType) - Method in class org.apache.spark.ml.recommendation.ALSModel
transformSchema(StructType) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
transformSchema(StructType) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
transformSchema(StructType) - Method in class org.apache.spark.ml.regression.IsotonicRegression
transformSchema(StructType) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.CrossValidator
transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
transformSchema(StructType) - Method in class org.apache.spark.ml.UnaryTransformer
transformSchemaImpl(StructType) - Method in interface org.apache.spark.ml.tuning.ValidatorParams
transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformWith(Function1<Try<T>, Future<S>>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
transformWith(Function1<Try<T>, Future<S>>, ExecutionContext) - Method in interface org.apache.spark.FutureAction
transformWith(Function1<Try<T>, Future<S>>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
transformWith(JavaDStream, Function3<R, JavaRDD, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream, Function2<RDD<T>, RDD, RDD<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream, Function3<RDD<T>, RDD, Time, RDD<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaDStream, Function3<R, JavaRDD, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
translate(Column, String, String) - Static method in class org.apache.spark.sql.functions: Translates any character in src that matches a character in matchingString to the corresponding character in replaceString.
transpose() - Method in class org.apache.spark.ml.linalg.DenseMatrix
transpose() - Method in interface org.apache.spark.ml.linalg.Matrix: Transpose the Matrix.
transpose() - Method in class org.apache.spark.ml.linalg.SparseMatrix
transpose() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
transpose() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Transpose this BlockMatrix.
transpose() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix: Transposes this CoordinateMatrix.
transpose() - Method in interface org.apache.spark.mllib.linalg.Matrix: Transpose the Matrix.
transpose() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Aggregates the elements of this RDD in a multi-level tree pattern.
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: JavaRDDLike.treeAggregate with a suggested depth of 2.
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag) - Method in class org.apache.spark.rdd.RDD: Aggregates the elements of this RDD in a multi-level tree pattern.
TreeClassifierParams - Interface in org.apache.spark.ml.tree: Parameters for Decision Tree-based classification algorithms.
TreeEnsembleModel<M extends DecisionTreeModel> - Interface in org.apache.spark.ml.tree: Abstraction for models which are ensembles of decision trees.
TreeEnsembleParams - Interface in org.apache.spark.ml.tree: Parameters for Decision Tree-based ensemble algorithms.
treeID() - Method in class org.apache.spark.ml.tree.EnsembleModelReadWrite.EnsembleNodeData
treeId() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.NodeData
treeReduce(Function2<T, T, T>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Reduces the elements of this RDD in a multi-level tree pattern.
treeReduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: JavaRDDLike.treeReduce with a suggested depth of 2.
treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.rdd.RDD: Reduces the elements of this RDD in a multi-level tree pattern.
TreeRegressorParams - Interface in org.apache.spark.ml.tree: Parameters for Decision Tree-based regression algorithms.
trees() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
trees() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
trees() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
trees() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
trees() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel: Trees in this ensemble.
trees() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
trees() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
treeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
treeString() - Method in class org.apache.spark.sql.types.StructType
treeWeights() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
treeWeights() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
treeWeights() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
treeWeights() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
treeWeights() - Method in interface org.apache.spark.ml.tree.TreeEnsembleModel: Weights for each tree, zippable with trees.
treeWeights() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
triangleCount() - Method in class org.apache.spark.graphx.GraphOps: Compute the number of triangles passing through each vertex.
TriangleCount - Class in org.apache.spark.graphx.lib: Compute the number of triangles passing through each vertex.
TriangleCount() - Constructor for class org.apache.spark.graphx.lib.TriangleCount
trigger(Trigger) - Method in class org.apache.spark.sql.streaming.DataStreamWriter: Set the trigger for the stream query.
Trigger - Class in org.apache.spark.sql.streaming: Policy used to indicate how often results should be produced by a StreamingQuery.
Trigger() - Constructor for class org.apache.spark.sql.streaming.Trigger
TriggerThreadDump$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.TriggerThreadDump$
trim(Column) - Static method in class org.apache.spark.sql.functions: Trim the spaces from both ends for the specified string column.
trim(Column, String) - Static method in class org.apache.spark.sql.functions: Trim the specified character from both ends for the specified string column.
TrimHorizon() - Constructor for class org.apache.spark.streaming.kinesis.KinesisInitialPositions.TrimHorizon
TripletFields - Class in org.apache.spark.graphx: Represents a subset of the fields of an EdgeTriplet or EdgeContext.
TripletFields() - Constructor for class org.apache.spark.graphx.TripletFields: Constructs a default TripletFields in which all fields are included.
TripletFields(boolean, boolean, boolean) - Constructor for class org.apache.spark.graphx.TripletFields
triplets() - Method in class org.apache.spark.graphx.Graph: An RDD containing the edge triplets, which are edges along with the vertex data associated with the adjacent vertices.
triplets() - Method in class org.apache.spark.graphx.impl.GraphImpl: Return an RDD that brings edges together with their source and destination vertices.
truePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns true positive rate for a given label (category).
truePositiveRateByLabel() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns true positive rate for each label (category).
trunc(Column, String) - Static method in class org.apache.spark.sql.functions: Returns date truncated to the unit specified by the format.
truncatedString(Seq<T>, String, String, String, int) - Static method in class org.apache.spark.util.Utils: Format a sequence with semantics similar to calling .mkString().
truncatedString(Seq<T>, String) - Static method in class org.apache.spark.util.Utils: Shorthand for calling truncatedString() without start or end strings.
tryLog(Function0<T>) - Static method in class org.apache.spark.util.Utils: Executes the given block in a Try, logging any uncaught exceptions.
tryLogNonFatalError(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils: Executes the given block.
tryOrExit(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils: Execute a block of code that evaluates to Unit, forwarding any uncaught exceptions to the default UncaughtExceptionHandler.
tryOrIOException(Function0<T>) - Static method in class org.apache.spark.util.Utils: Execute a block of code that returns a value, re-throwing any non-fatal uncaught exceptions as IOException.
tryOrStopSparkContext(SparkContext, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils: Execute a block of code that evaluates to Unit, stopping the SparkContext if there is any uncaught exception.
tryRecoverFromCheckpoint(String) - Method in class org.apache.spark.streaming.StreamingContextPythonHelper: This is a private method only for Python to implement getOrCreate.
tryWithResource(Function0<R>, Function1<R, T>) - Static method in class org.apache.spark.util.Utils
tryWithSafeFinally(Function0<T>, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils: Execute a block of code, then a finally block, but if exceptions happen in the finally block, do not suppress the original exception.
tryWithSafeFinallyAndFailureCallbacks(Function0<T>, Function0<BoxedUnit>, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.Utils: Execute a block of code and call the failure callbacks in the catch block.
tuple(Encoder<T1>, Encoder<T2>) - Static method in class org.apache.spark.sql.Encoders: An encoder for 2-ary tuples.
tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>) - Static method in class org.apache.spark.sql.Encoders: An encoder for 3-ary tuples.
tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>, Encoder<T4>) - Static method in class org.apache.spark.sql.Encoders: An encoder for 4-ary tuples.
tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>, Encoder<T4>, Encoder<T5>) - Static method in class org.apache.spark.sql.Encoders: An encoder for 5-ary tuples.
tValues() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary: T-statistic of estimated coefficients and intercept.
tValues() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: T-statistic of estimated coefficients and intercept.
Tweedie$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Tweedie$
TYPE() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
typed - Class in org.apache.spark.sql.expressions.javalang: :: Experimental :: Type-safe functions available for Dataset operations in Java.
typed() - Constructor for class org.apache.spark.sql.expressions.javalang.typed
typed - Class in org.apache.spark.sql.expressions.scalalang: :: Experimental :: Type-safe functions available for Dataset operations in Scala.
typed() - Constructor for class org.apache.spark.sql.expressions.scalalang.typed
TypedColumn<T,U> - Class in org.apache.spark.sql: A Column where an Encoder has been given for the expected input and return type.
TypedColumn(Expression, ExpressionEncoder) - Constructor for class org.apache.spark.sql.TypedColumn
typedLit(T, TypeTags.TypeTag<T>) - Static method in class org.apache.spark.sql.functions: Creates a Column of literal value.
typeInfoConversions(DataType) - Constructor for class org.apache.spark.sql.hive.HiveInspectors.typeInfoConversions
typeInfoConversions(DataType) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
typeName() - Method in class org.apache.spark.mllib.linalg.VectorUDT
typeName() - Static method in class org.apache.spark.sql.types.BinaryType
typeName() - Static method in class org.apache.spark.sql.types.BooleanType
typeName() - Static method in class org.apache.spark.sql.types.ByteType
typeName() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
typeName() - Method in class org.apache.spark.sql.types.DataType: Name of the type used in JSON serialization.
typeName() - Static method in class org.apache.spark.sql.types.DateType
typeName() - Method in class org.apache.spark.sql.types.DecimalType
typeName() - Static method in class org.apache.spark.sql.types.DoubleType
typeName() - Static method in class org.apache.spark.sql.types.FloatType
typeName() - Static method in class org.apache.spark.sql.types.IntegerType
typeName() - Static method in class org.apache.spark.sql.types.LongType
typeName() - Static method in class org.apache.spark.sql.types.NullType
typeName() - Static method in class org.apache.spark.sql.types.ShortType
typeName() - Static method in class org.apache.spark.sql.types.StringType
typeName() - Static method in class org.apache.spark.sql.types.TimestampType

U

U() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
udf(Function0<RT>, TypeTags.TypeTag<RT>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 0 arguments as user-defined function (UDF).
udf(Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 1 argument as user-defined function (UDF).
udf(Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 2 arguments as user-defined function (UDF).
udf(Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 3 arguments as user-defined function (UDF).
udf(Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 4 arguments as user-defined function (UDF).
udf(Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 5 arguments as user-defined function (UDF).
udf(Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 6 arguments as user-defined function (UDF).
udf(Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 7 arguments as user-defined function (UDF).
udf(Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 8 arguments as user-defined function (UDF).
udf(Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 9 arguments as user-defined function (UDF).
udf(Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Static method in class org.apache.spark.sql.functions: Defines a Scala closure of 10 arguments as user-defined function (UDF).
udf(UDF0<?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF0 instance as user-defined function (UDF).
udf(UDF1<?, ?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF1 instance as user-defined function (UDF).
udf(UDF2<?, ?, ?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF2 instance as user-defined function (UDF).
udf(UDF3<?, ?, ?, ?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF3 instance as user-defined function (UDF).
udf(UDF4<?, ?, ?, ?, ?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF4 instance as user-defined function (UDF).
udf(UDF5<?, ?, ?, ?, ?, ?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF5 instance as user-defined function (UDF).
udf(UDF6<?, ?, ?, ?, ?, ?, ?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF6 instance as user-defined function (UDF).
udf(UDF7<?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF7 instance as user-defined function (UDF).
udf(UDF8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF8 instance as user-defined function (UDF).
udf(UDF9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF9 instance as user-defined function (UDF).
udf(UDF10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Static method in class org.apache.spark.sql.functions: Defines a Java UDF10 instance as user-defined function (UDF).
udf(Object, DataType) - Static method in class org.apache.spark.sql.functions: Defines a deterministic user-defined function (UDF) using a Scala closure.
udf() - Method in class org.apache.spark.sql.SparkSession: A collection of methods for registering user-defined functions (UDF).
udf() - Method in class org.apache.spark.sql.SQLContext: A collection of methods for registering user-defined functions (UDF).
UDF0<R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 0 arguments.
UDF1<T1,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 1 argument.
UDF10<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 10 arguments.
UDF11<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 11 arguments.
UDF12<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 12 arguments.
UDF13<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 13 arguments.
UDF14<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 14 arguments.
UDF15<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 15 arguments.
UDF16<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 16 arguments.
UDF17<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 17 arguments.
UDF18<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 18 arguments.
UDF19<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 19 arguments.
UDF2<T1,T2,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 2 arguments.
UDF20<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 20 arguments.
UDF21<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 21 arguments.
UDF22<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,T22,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 22 arguments.
UDF3<T1,T2,T3,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 3 arguments.
UDF4<T1,T2,T3,T4,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 4 arguments.
UDF5<T1,T2,T3,T4,T5,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 5 arguments.
UDF6<T1,T2,T3,T4,T5,T6,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 6 arguments.
UDF7<T1,T2,T3,T4,T5,T6,T7,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 7 arguments.
UDF8<T1,T2,T3,T4,T5,T6,T7,T8,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 8 arguments.
UDF9<T1,T2,T3,T4,T5,T6,T7,T8,T9,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 9 arguments.
UDFRegistration - Class in org.apache.spark.sql: Functions for registering user-defined functions.
UDTRegistration - Class in org.apache.spark.sql.types: This object keeps the mappings between user classes and their User Defined Types (UDTs).
UDTRegistration() - Constructor for class org.apache.spark.sql.types.UDTRegistration
uid() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
uid() - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
uid() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
uid() - Method in class org.apache.spark.ml.classification.GBTClassifier
uid() - Method in class org.apache.spark.ml.classification.LinearSVC
uid() - Method in class org.apache.spark.ml.classification.LinearSVCModel
uid() - Method in class org.apache.spark.ml.classification.LogisticRegression
uid() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
uid() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
uid() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
uid() - Method in class org.apache.spark.ml.classification.NaiveBayes
uid() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
uid() - Method in class org.apache.spark.ml.classification.OneVsRest
uid() - Method in class org.apache.spark.ml.classification.OneVsRestModel
uid() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
uid() - Method in class org.apache.spark.ml.classification.RandomForestClassifier
uid() - Method in class org.apache.spark.ml.clustering.BisectingKMeans
uid() - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
uid() - Method in class org.apache.spark.ml.clustering.GaussianMixture
uid() - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
uid() - Method in class org.apache.spark.ml.clustering.KMeans
uid() - Method in class org.apache.spark.ml.clustering.KMeansModel
uid() - Method in class org.apache.spark.ml.clustering.LDA
uid() - Method in class org.apache.spark.ml.clustering.LDAModel
uid() - Method in class org.apache.spark.ml.clustering.PowerIterationClustering
uid() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
uid() - Method in class org.apache.spark.ml.evaluation.ClusteringEvaluator
uid() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
uid() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
uid() - Method in class org.apache.spark.ml.feature.Binarizer
uid() - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
uid() - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
uid() - Method in class org.apache.spark.ml.feature.Bucketizer
uid() - Method in class org.apache.spark.ml.feature.ChiSqSelector
uid() - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
uid() - Method in class org.apache.spark.ml.feature.ColumnPruner
uid() - Method in class org.apache.spark.ml.feature.CountVectorizer
uid() - Method in class org.apache.spark.ml.feature.CountVectorizerModel
uid() - Method in class org.apache.spark.ml.feature.DCT
uid() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
uid() - Method in class org.apache.spark.ml.feature.FeatureHasher
uid() - Method in class org.apache.spark.ml.feature.HashingTF
uid() - Method in class org.apache.spark.ml.feature.IDF
uid() - Method in class org.apache.spark.ml.feature.IDFModel
uid() - Method in class org.apache.spark.ml.feature.Imputer
uid() - Method in class org.apache.spark.ml.feature.ImputerModel
uid() - Method in class org.apache.spark.ml.feature.IndexToString
uid() - Method in class org.apache.spark.ml.feature.Interaction
uid() - Method in class org.apache.spark.ml.feature.MaxAbsScaler
uid() - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
uid() - Method in class org.apache.spark.ml.feature.MinHashLSH
uid() - Method in class org.apache.spark.ml.feature.MinHashLSHModel
uid() - Method in class org.apache.spark.ml.feature.MinMaxScaler
uid() - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
uid() - Method in class org.apache.spark.ml.feature.NGram
uid() - Method in class org.apache.spark.ml.feature.Normalizer
uid() - Method in class org.apache.spark.ml.feature.OneHotEncoder: Deprecated.
uid() - Method in class org.apache.spark.ml.feature.OneHotEncoderEstimator
uid() - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
uid() - Method in class org.apache.spark.ml.feature.PCA
uid() - Method in class org.apache.spark.ml.feature.PCAModel
uid() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
uid() - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
uid() - Method in class org.apache.spark.ml.feature.RegexTokenizer
uid() - Method in class org.apache.spark.ml.feature.RFormula
uid() - Method in class org.apache.spark.ml.feature.RFormulaModel
uid() - Method in class org.apache.spark.ml.feature.SQLTransformer
uid() - Method in class org.apache.spark.ml.feature.StandardScaler
uid() - Method in class org.apache.spark.ml.feature.StandardScalerModel
uid() - Method in class org.apache.spark.ml.feature.StopWordsRemover
uid() - Method in class org.apache.spark.ml.feature.StringIndexer
uid() - Method in class org.apache.spark.ml.feature.StringIndexerModel
uid() - Method in class org.apache.spark.ml.feature.Tokenizer
uid() - Method in class org.apache.spark.ml.feature.VectorAssembler
uid() - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
uid() - Method in class org.apache.spark.ml.feature.VectorIndexer
uid() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
uid() - Method in class org.apache.spark.ml.feature.VectorSizeHint
uid() - Method in class org.apache.spark.ml.feature.VectorSlicer
uid() - Method in class org.apache.spark.ml.feature.Word2Vec
uid() - Method in class org.apache.spark.ml.feature.Word2VecModel
uid() - Method in class org.apache.spark.ml.fpm.FPGrowth
uid() - Method in class org.apache.spark.ml.fpm.FPGrowthModel
uid() - Method in class org.apache.spark.ml.fpm.PrefixSpan
uid() - Method in class org.apache.spark.ml.Pipeline
uid() - Method in class org.apache.spark.ml.PipelineModel
uid() - Method in class org.apache.spark.ml.recommendation.ALS
uid() - Method in class org.apache.spark.ml.recommendation.ALSModel
uid() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
uid() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
uid() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
uid() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
uid() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
uid() - Method in class org.apache.spark.ml.regression.GBTRegressor
uid() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
uid() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
uid() - Method in class org.apache.spark.ml.regression.IsotonicRegression
uid() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
uid() - Method in class org.apache.spark.ml.regression.LinearRegression
uid() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
uid() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
uid() - Method in class org.apache.spark.ml.regression.RandomForestRegressor
uid() - Method in class org.apache.spark.ml.tuning.CrossValidator
uid() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
uid() - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
uid() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
uid() - Method in interface org.apache.spark.ml.util.Identifiable: An immutable unique ID for the object and its derivatives.
uiRoot() - Method in interface org.apache.spark.status.api.v1.ApiRequestContext
UIRoot - Interface in org.apache.spark.status.api.v1: This trait is shared by all the root containers for application UI information -- the HistoryServer and the application UI.
uiRoot(HttpServletRequest) - Static method in class org.apache.spark.ui.UIUtils
UIRootFromServletContext - Class in org.apache.spark.status.api.v1
UIRootFromServletContext() - Constructor for class org.apache.spark.status.api.v1.UIRootFromServletContext
UIUtils - Class in org.apache.spark.streaming.ui
UIUtils() - Constructor for class org.apache.spark.streaming.ui.UIUtils
UIUtils - Class in org.apache.spark.ui: Utility functions for generating XML pages with spark content.
UIUtils() - Constructor for class org.apache.spark.ui.UIUtils
uiWebUrl() - Method in class org.apache.spark.SparkContext
UIWorkloadGenerator - Class in org.apache.spark.ui: Continuously generates jobs that expose various features of the WebUI (internal testing tool).
UIWorkloadGenerator() - Constructor for class org.apache.spark.ui.UIWorkloadGenerator
unapply(EdgeContext<VD, ED, A>) - Static method in class org.apache.spark.graphx.EdgeContext: Extractor mainly used for Graph#aggregateMessages*.
unapply(DenseVector) - Static method in class org.apache.spark.ml.linalg.DenseVector: Extracts the value array from a dense vector.
unapply(SparseVector) - Static method in class org.apache.spark.ml.linalg.SparseVector
unapply(DenseVector) - Static method in class org.apache.spark.mllib.linalg.DenseVector: Extracts the value array from a dense vector.
unapply(SparseVector) - Static method in class org.apache.spark.mllib.linalg.SparseVector
unapply(Column) - Static method in class org.apache.spark.sql.Column
unapply(Expression) - Method in class org.apache.spark.sql.types.DecimalType.Expression$
unapply(DecimalType) - Method in class org.apache.spark.sql.types.DecimalType.Fixed$
unapply(DataType) - Static method in class org.apache.spark.sql.types.DecimalType
unapply(Expression) - Static method in class org.apache.spark.sql.types.DecimalType
unapply(Expression) - Static method in class org.apache.spark.sql.types.NumericType: Enables matching against NumericType for expressions:
unapply(Throwable) - Static method in class org.apache.spark.util.CausedBy
unapply(String) - Static method in class org.apache.spark.util.IntParam
unapply(String) - Static method in class org.apache.spark.util.MemoryParam
UnaryTransformer<IN,OUT,T extends UnaryTransformer<IN,OUT,T>> - Class in org.apache.spark.ml: :: DeveloperApi :: Abstract class for transformers that take one input column, apply transformation, and output the result as a new column.
UnaryTransformer() - Constructor for class org.apache.spark.ml.UnaryTransformer
unbase64(Column) - Static method in class org.apache.spark.sql.functions: Decodes a BASE64 encoded string column and returns it as a binary column.
unboundedFollowing() - Static method in class org.apache.spark.sql.expressions.Window: Value representing the last row in the partition, equivalent to "UNBOUNDED FOLLOWING" in SQL.
unboundedFollowing() - Static method in class org.apache.spark.sql.functions: Deprecated. Use Window.unboundedFollowing. Since 2.4.0.
unboundedPreceding() - Static method in class org.apache.spark.sql.expressions.Window: Value representing the first row in the partition, equivalent to "UNBOUNDED PRECEDING" in SQL.
unboundedPreceding() - Static method in class org.apache.spark.sql.functions: Deprecated. Use Window.unboundedPreceding. Since 2.4.0.
unbroadcast(long, boolean, boolean) - Method in interface org.apache.spark.broadcast.BroadcastFactory
uncacheTable(String) - Method in class org.apache.spark.sql.catalog.Catalog: Removes the specified table from the in-memory cache.
uncacheTable(String) - Method in class org.apache.spark.sql.SQLContext: Removes the specified table from the in-memory cache.
UNCAUGHT_EXCEPTION() - Static method in class org.apache.spark.util.SparkExitCode: The default uncaught exception handler was reached.
UNCAUGHT_EXCEPTION_TWICE() - Static method in class org.apache.spark.util.SparkExitCode: The default uncaught exception handler was called and an exception was encountered while
undefinedImageType() - Static method in class org.apache.spark.ml.image.ImageSchema
underlyingSplit() - Method in class org.apache.spark.scheduler.SplitInfo
unhandledFilters(Filter[]) - Method in class org.apache.spark.sql.sources.BaseRelation: Returns the list of Filters that this datasource may not be able to handle.
unhex(Column) - Static method in class org.apache.spark.sql.functions: Inverse of hex.
UniformGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d.
UniformGenerator() - Constructor for class org.apache.spark.mllib.random.UniformGenerator
uniformJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.uniformRDD.
uniformJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaRDD with the default seed.
uniformJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaRDD with the default number of partitions and the default seed.
uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.uniformVectorRDD.
uniformJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaVectorRDD with the default seed.
uniformJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaVectorRDD with the default number of partitions and the default seed.
uniformRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0).
uniformVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution U(0.0, 1.0).
union(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the union of this RDD and another one.
union(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the union of this RDD and another one.
union(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return the union of this RDD and another one.
union(JavaRDD<T>, List<JavaRDD<T>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(JavaPairRDD<K, V>, List<JavaPairRDD<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(JavaDoubleRDD, List<JavaDoubleRDD>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return the union of this RDD and another one.
union(Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Build the union of a list of RDDs.
union(RDD<T>, Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Build the union of a list of RDDs passed as variable-length arguments.
union(Dataset<T>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset containing union of rows in this Dataset and another Dataset.
union(JavaDStream<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream by unifying data of another DStream with this DStream.
union(JavaPairDStream<K, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by unifying data of another DStream with this DStream.
union(JavaDStream<T>, List<JavaDStream<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(JavaPairDStream<K, V>, List<JavaPairDStream<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(DStream<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by unifying data of another DStream with this DStream.
union(Seq<DStream<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
unionAll(Dataset<T>) - Method in class org.apache.spark.sql.Dataset: Deprecated. Use union(). Since 2.0.0.
unionByName(Dataset<T>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset containing union of rows in this Dataset and another Dataset.
UnionRDD<T> - Class in org.apache.spark.rdd
UnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionRDD
uniqueId() - Method in class org.apache.spark.storage.StreamBlockId
unix_timestamp() - Static method in class org.apache.spark.sql.functions: Returns the current Unix timestamp (in seconds) as a long.
unix_timestamp(Column) - Static method in class org.apache.spark.sql.functions: Converts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp (in seconds), using the default timezone and the default locale.
unix_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions: Converts time string with given pattern to Unix timestamp (in seconds).
UnknownReason - Class in org.apache.spark: :: DeveloperApi :: We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result.
UnknownReason() - Constructor for class org.apache.spark.UnknownReason
UNLIMITED_DECIMAL_PRECISION() - Static method in class org.apache.spark.sql.hive.HiveShim
UNLIMITED_DECIMAL_SCALE() - Static method in class org.apache.spark.sql.hive.HiveShim
unlink(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.CLogLog$
unlink(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Identity$
unlink(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Inverse$
unlink(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Log$
unlink(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Logit$
unlink(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Probit$
unlink(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Sqrt$
UNORDERED() - Static method in class org.apache.spark.rdd.DeterministicLevel
unpersist() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaPairRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.broadcast.Broadcast: Asynchronously delete cached copies of this broadcast on the executors.
unpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast: Delete cached copies of this broadcast on the executors.
unpersist(boolean) - Method in class org.apache.spark.graphx.Graph: Uncaches both vertices and edges of this graph.
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
unpersist() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Unpersist intermediate RDDs used in the computation.
unpersist(boolean) - Method in class org.apache.spark.rdd.RDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.sql.Dataset: Mark the Dataset as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.sql.Dataset: Mark the Dataset as non-persistent, and remove all blocks for it from memory and disk.
unpersistRDDFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
unpersistRDDToJson(SparkListenerUnpersistRDD) - Static method in class org.apache.spark.util.JsonProtocol
unpersistVertices(boolean) - Method in class org.apache.spark.graphx.Graph: Uncaches only the vertices of this graph, leaving the edges alone.
unpersistVertices(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
UnrecognizedBlockId - Exception in org.apache.spark.storage
UnrecognizedBlockId(String) - Constructor for exception org.apache.spark.storage.UnrecognizedBlockId
unregister(QueryExecutionListener) - Method in class org.apache.spark.sql.util.ExecutionListenerManager: Unregisters the specified QueryExecutionListener.
unregisterDialect(JdbcDialect) - Static method in class org.apache.spark.sql.jdbc.JdbcDialects: Unregister a dialect.
Unresolved() - Static method in class org.apache.spark.ml.attribute.AttributeType: Unresolved type.
UnresolvedAttribute - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: An unresolved attribute.
UnresolvedAttribute() - Constructor for class org.apache.spark.ml.attribute.UnresolvedAttribute
unset() - Static method in class org.apache.spark.rdd.InputFileBlockHolder: Clears the input file block to default value.
unset(String) - Method in class org.apache.spark.sql.RuntimeConfig: Resets the configuration property for the given key.
until(Time, Duration) - Method in class org.apache.spark.streaming.Time
unwrapOrcStructs(Configuration, StructType, StructType, Option<StructObjectInspector>, Iterator<Writable>) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
unwrapperFor(ObjectInspector) - Method in interface org.apache.spark.sql.hive.HiveInspectors: Builds unwrappers ahead of time according to object inspector types to avoid pattern matching and branching costs per row.
unwrapperFor(StructField) - Method in interface org.apache.spark.sql.hive.HiveInspectors: Builds unwrappers ahead of time according to object inspector types to avoid pattern matching and branching costs per row.
unwrapperFor(ObjectInspector) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
unwrapperFor(StructField) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
update(int, int, double) - Method in interface org.apache.spark.ml.linalg.Matrix: Update element at (i, j)
update(Function1<Object, Object>) - Method in interface org.apache.spark.ml.linalg.Matrix: Update all the values of this matrix using the function f.
update(RDD<Vector>, double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel: Perform a k-means update on a batch of data.
update(int, int, double) - Method in interface org.apache.spark.mllib.linalg.Matrix: Update element at (i, j)
update(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix: Update all the values of this matrix using the function f.
update() - Method in class org.apache.spark.scheduler.AccumulableInfo
update(int, Object) - Method in class org.apache.spark.sql.expressions.MutableAggregationBuffer: Update the ith value of this buffer.
update(MutableAggregationBuffer, Row) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Updates the given aggregation buffer with new input data from input.
update(S) - Method in interface org.apache.spark.sql.streaming.GroupState: Update the value of the state.
Update() - Static method in class org.apache.spark.sql.streaming.OutputMode: OutputMode in which only the rows that were updated in the streaming DataFrame/Dataset will be written to the sink every time there are some updates.
update(int, Object) - Method in class org.apache.spark.sql.vectorized.ColumnarArray
update(int, Object) - Method in class org.apache.spark.sql.vectorized.ColumnarRow
update() - Method in class org.apache.spark.status.api.v1.AccumulableInfo
update(Seq<String>, String, long, long) - Method in class org.apache.spark.status.LiveRDDPartition
update(S) - Method in class org.apache.spark.streaming.State: Update the state with a new value.
update(T1, T2) - Method in class org.apache.spark.util.MutablePair: Updates this pair with new values and returns itself
UpdateBlockInfo(BlockManagerId, BlockId, StorageLevel, long, long) - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
UpdateBlockInfo() - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
UpdateBlockInfo$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo$
UPDATED_BLOCK_STATUSES() - Static method in class org.apache.spark.InternalAccumulator
UpdateDelegationTokens(byte[]) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens
UpdateDelegationTokens$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.UpdateDelegationTokens$
updateMetrics(TaskMetrics) - Method in class org.apache.spark.status.LiveTask: Update the metrics for the task and return the difference between the previous and new values.
updatePrediction(Vector, double, DecisionTreeRegressionModel, double) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees: Add prediction from a new boosting iteration to an existing prediction.
updatePredictionError(RDD<LabeledPoint>, RDD<Tuple2<Object, Object>>, double, DecisionTreeRegressionModel, Loss) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees: Update a zipped predictionError RDD (as obtained with computeInitialPredictionAndError)
updatePredictionError(RDD<LabeledPoint>, RDD<Tuple2<Object, Object>>, double, DecisionTreeModel, Loss) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel: :: DeveloperApi :: Update a zipped predictionError RDD (as obtained with computeInitialPredictionAndError)
Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to perform steps (weight update) using Gradient Descent methods.
Updater() - Constructor for class org.apache.spark.mllib.optimization.Updater
updateSparkConfigFromProperties(SparkConf, Map<String, String>) - Static method in class org.apache.spark.util.Utils: Updates Spark config with properties from a set of Properties.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner, JavaPairRDD<K, S>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, int, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, RDD<Tuple2<K, S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, RDD<Tuple2<K, S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function4<Time, K, Seq<V>, Option<S>, Option<S>>, Partitioner, boolean, Option<RDD<Tuple2<K, S>>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
upper(Column) - Static method in class org.apache.spark.sql.functions: Converts a string column to upper case.
upperBoundsOnCoefficients() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams: The upper bounds on coefficients if fitting under bound constrained optimization.
upperBoundsOnIntercepts() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams: The upper bounds on intercepts if fitting under bound constrained optimization.
useCommitCoordinator() - Method in interface org.apache.spark.sql.sources.v2.writer.DataSourceWriter: Returns whether Spark should use the commit coordinator to ensure that at most one task for each partition commits.
useDisk() - Method in class org.apache.spark.storage.StorageLevel
usedOffHeap() - Method in class org.apache.spark.status.LiveExecutor
usedOffHeapStorageMemory() - Method in interface org.apache.spark.SparkExecutorInfo
usedOffHeapStorageMemory() - Method in class org.apache.spark.SparkExecutorInfoImpl
usedOffHeapStorageMemory() - Method in class org.apache.spark.status.api.v1.MemoryMetrics
usedOnHeap() - Method in class org.apache.spark.status.LiveExecutor
usedOnHeapStorageMemory() - Method in interface org.apache.spark.SparkExecutorInfo
usedOnHeapStorageMemory() - Method in class org.apache.spark.SparkExecutorInfoImpl
usedOnHeapStorageMemory() - Method in class org.apache.spark.status.api.v1.MemoryMetrics
useDst - Variable in class org.apache.spark.graphx.TripletFields: Indicates whether the destination vertex attribute is included.
useEdge - Variable in class org.apache.spark.graphx.TripletFields: Indicates whether the edge attribute is included.
useMemory() - Method in class org.apache.spark.storage.StorageLevel
useNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
useOffHeap() - Method in class org.apache.spark.storage.StorageLevel
user() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
user() - Method in class org.apache.spark.mllib.recommendation.Rating
USER_DEFAULT() - Static method in class org.apache.spark.sql.types.DecimalType
userClass() - Method in class org.apache.spark.mllib.linalg.VectorUDT
userCol() - Method in interface org.apache.spark.ml.recommendation.ALSModelParams: Param for the column name for user ids.
UserDefinedAggregateFunction - Class in org.apache.spark.sql.expressions: The base class for implementing user-defined aggregate functions (UDAF).
UserDefinedAggregateFunction() - Constructor for class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
UserDefinedFunction - Class in org.apache.spark.sql.expressions: A user-defined function.
userFactors() - Method in class org.apache.spark.ml.recommendation.ALSModel
userFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
userPort(int, int) - Static method in class org.apache.spark.util.Utils: Returns the user port to try when trying to bind a service.
useSrc - Variable in class org.apache.spark.graphx.TripletFields: Indicates whether the source vertex attribute is included.
usingBoundConstrainedOptimization() - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
Utils - Class in org.apache.spark.ml.impl
Utils() - Constructor for class org.apache.spark.ml.impl.Utils
Utils - Class in org.apache.spark.util: Various utility methods used by Spark.
Utils() - Constructor for class org.apache.spark.util.Utils
UUIDFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
UUIDToJson(UUID) - Static method in class org.apache.spark.util.JsonProtocol

V

V() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
validate() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Validates the block matrix info against the matrix data (blocks) and throws an exception if any error is found.
validateAndTransformSchema(StructType, boolean, DataType) - Method in interface org.apache.spark.ml.classification.ClassifierParams
validateAndTransformSchema(StructType, boolean, DataType) - Method in interface org.apache.spark.ml.classification.LogisticRegressionParams
validateAndTransformSchema(StructType, boolean, DataType) - Method in interface org.apache.spark.ml.classification.ProbabilisticClassifierParams
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.clustering.BisectingKMeansParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.clustering.GaussianMixtureParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.clustering.KMeansParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.clustering.LDAParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.feature.CountVectorizerParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.feature.IDFBase: Validate and transform the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.feature.ImputerParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.feature.LSHParams: Transform the Schema for LSH
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.feature.MaxAbsScalerParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.feature.MinMaxScalerParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType, boolean, boolean) - Method in interface org.apache.spark.ml.feature.OneHotEncoderBase
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.feature.PCAParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.feature.StandardScalerParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.feature.StringIndexerBase: Validates and transforms the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.feature.Word2VecBase: Validate and transform the input schema.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.fpm.FPGrowthParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType, boolean, DataType) - Method in interface org.apache.spark.ml.PredictorParams: Validates and transforms the input schema with the provided param map.
validateAndTransformSchema(StructType) - Method in interface org.apache.spark.ml.recommendation.ALSParams: Validates and transforms the input schema.
validateAndTransformSchema(StructType, boolean) - Method in interface org.apache.spark.ml.regression.AFTSurvivalRegressionParams: Validates and transforms the input schema with the provided param map.
validateAndTransformSchema(StructType, boolean, DataType) - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
validateAndTransformSchema(StructType, boolean) - Method in interface org.apache.spark.ml.regression.IsotonicRegressionBase: Validates and transforms input schema.
validateAndTransformSchema(StructType, boolean, DataType) - Method in interface org.apache.spark.ml.regression.LinearRegressionParams
validateAndTransformSchema(StructType, boolean, DataType) - Method in interface org.apache.spark.ml.tree.DecisionTreeRegressorParams
validateDirectoryUri(String) - Method in interface org.apache.spark.rpc.RpcEnvFileServer: Validates and normalizes the base URI for directories.
validateStages(PipelineStage[]) - Method in class org.apache.spark.ml.Pipeline.SharedReadWrite$: Check that all stages are Writable
validateURL(URI) - Static method in class org.apache.spark.util.Utils: Validate that a given URI is actually a valid URL as well.
validateVectorCompatibleColumn(StructType, String) - Static method in class org.apache.spark.ml.util.SchemaUtils: Check whether the given column in the schema is one of the supported vector types: Vector, Array[Float].
validationIndicatorCol() - Method in interface org.apache.spark.ml.param.shared.HasValidationIndicatorCol: Param for name of the column that indicates whether each row is for training or for validation.
validationMetrics() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
validationTol() - Method in interface org.apache.spark.ml.tree.GBTParams: Threshold for stopping early when fit with validation is used.
validationTol() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
ValidatorParams - Interface in org.apache.spark.ml.tuning: Common params for TrainValidationSplitParams and CrossValidatorParams.
value() - Method in class org.apache.spark.Accumulable: Deprecated. Access the accumulator's current value; only allowed on driver.
value() - Method in class org.apache.spark.broadcast.Broadcast: Get the broadcasted value.
value() - Method in class org.apache.spark.ComplexFutureAction
value() - Method in interface org.apache.spark.FutureAction: The value of this Future.
value() - Method in class org.apache.spark.ml.param.ParamPair
value() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
value() - Method in class org.apache.spark.mllib.stat.test.BinarySample
value() - Method in class org.apache.spark.scheduler.AccumulableInfo
value() - Method in class org.apache.spark.SerializableWritable
value() - Method in class org.apache.spark.SimpleFutureAction
value() - Method in class org.apache.spark.sql.sources.EqualNullSafe
value() - Method in class org.apache.spark.sql.sources.EqualTo
value() - Method in class org.apache.spark.sql.sources.GreaterThan
value() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
value() - Method in class org.apache.spark.sql.sources.LessThan
value() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
value() - Method in class org.apache.spark.sql.sources.StringContains
value() - Method in class org.apache.spark.sql.sources.StringEndsWith
value() - Method in class org.apache.spark.sql.sources.StringStartsWith
value() - Method in class org.apache.spark.status.api.v1.AccumulableInfo
value() - Method in class org.apache.spark.status.LiveRDDPartition
value() - Method in class org.apache.spark.storage.memory.DeserializedMemoryEntry
value() - Method in class org.apache.spark.util.AccumulatorV2: Defines the current value of this accumulator
value() - Method in class org.apache.spark.util.CollectionAccumulator
value() - Method in class org.apache.spark.util.DoubleAccumulator
value() - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
value() - Method in class org.apache.spark.util.LongAccumulator
valueArray() - Method in class org.apache.spark.sql.vectorized.ColumnarMap
valueContainsNull() - Method in class org.apache.spark.sql.types.MapType
valueOf(String) - Static method in enum org.apache.spark.graphx.impl.EdgeActiveness: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.JobExecutionStatus: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.launcher.SparkAppHandle.State: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.sql.SaveMode: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.status.api.v1.ApplicationStatus: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.status.api.v1.StageStatus: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.status.api.v1.streaming.BatchStatus: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.status.api.v1.TaskSorting: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.streaming.StreamingContextState: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.util.sketch.BloomFilter.Version: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.util.sketch.CountMinSketch.Version: Returns the enum constant of this type with the specified name.
values() - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the values of each tuple.
values() - Static method in enum org.apache.spark.graphx.impl.EdgeActiveness: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.JobExecutionStatus: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.launcher.SparkAppHandle.State: Returns an array containing the constants of this enum type, in the order they are declared.
VALUES() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
values() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
values() - Method in class org.apache.spark.ml.attribute.NominalAttribute
values() - Method in class org.apache.spark.ml.linalg.DenseMatrix
values() - Method in class org.apache.spark.ml.linalg.DenseVector
values() - Method in class org.apache.spark.ml.linalg.SparseMatrix
values() - Method in class org.apache.spark.ml.linalg.SparseVector
values() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
values() - Method in class org.apache.spark.mllib.linalg.DenseVector
values() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
values() - Method in class org.apache.spark.mllib.linalg.SparseVector
values() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
values() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
values() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
values() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
values() - Static method in class org.apache.spark.rdd.CheckpointState
values() - Static method in class org.apache.spark.rdd.DeterministicLevel
values() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the values of each tuple.
values() - Static method in class org.apache.spark.scheduler.SchedulingMode
values() - Static method in class org.apache.spark.scheduler.TaskLocality
values() - Static method in enum org.apache.spark.sql.SaveMode: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in class org.apache.spark.sql.sources.In
values() - Static method in enum org.apache.spark.status.api.v1.ApplicationStatus: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.status.api.v1.StageStatus: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.status.api.v1.streaming.BatchStatus: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.status.api.v1.TaskSorting: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
values() - Static method in enum org.apache.spark.streaming.StreamingContextState: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in class org.apache.spark.TaskState
values() - Static method in enum org.apache.spark.util.sketch.BloomFilter.Version: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.util.sketch.CountMinSketch.Version: Returns an array containing the constants of this enum type, in the order they are declared.
ValuesHolder<T> - Interface in org.apache.spark.storage.memory
valueType() - Method in class org.apache.spark.sql.types.MapType
var_pop(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the population variance of the values in a group.
var_pop(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the population variance of the values in a group.
var_samp(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the unbiased variance of the values in a group.
var_samp(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the unbiased variance of the values in a group.
VarcharType - Class in org.apache.spark.sql.types: Hive varchar type.
VarcharType(int) - Constructor for class org.apache.spark.sql.types.VarcharType
variance() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the population variance of this RDD's elements.
variance(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
variance(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
variance(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
variance(double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
variance(Column, Column) - Static method in class org.apache.spark.ml.stat.Summarizer
variance(Column) - Static method in class org.apache.spark.ml.stat.Summarizer
variance() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Unbiased estimate of sample variance of each dimension.
variance() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample variance vector.
Variance - Class in org.apache.spark.mllib.tree.impurity: Class for calculating variance during regression
Variance() - Constructor for class org.apache.spark.mllib.tree.impurity.Variance
variance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the population variance of this RDD's elements.
variance(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: alias for var_samp.
variance(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: alias for var_samp.
variance() - Method in class org.apache.spark.util.StatCounter: Return the population variance of the values.
varianceCol() - Method in interface org.apache.spark.ml.param.shared.HasVarianceCol: Param for Column name for the biased sample variance of prediction.
variancePower() - Method in interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase: Param for the power in the variance function of the Tweedie distribution which provides the relationship between the variance and mean of the distribution.
vClassTag() - Method in class org.apache.spark.api.java.JavaHadoopRDD
vClassTag() - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
vClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
Vector - Interface in org.apache.spark.ml.linalg: Represents a numeric vector, whose index type is Int and value type is Double.
vector() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
Vector - Interface in org.apache.spark.mllib.linalg: Represents a numeric vector, whose index type is Int and value type is Double.
vector() - Method in class org.apache.spark.storage.memory.DeserializedValuesHolder
VectorAssembler - Class in org.apache.spark.ml.feature: A feature transformer that merges multiple columns into a vector column.
VectorAssembler(String) - Constructor for class org.apache.spark.ml.feature.VectorAssembler
VectorAssembler() - Constructor for class org.apache.spark.ml.feature.VectorAssembler
VectorAttributeRewriter - Class in org.apache.spark.ml.feature: Utility transformer that rewrites Vector attribute names via prefix replacement.
VectorAttributeRewriter(String, String, Map<String, String>) - Constructor for class org.apache.spark.ml.feature.VectorAttributeRewriter
VectorAttributeRewriter(String, Map<String, String>) - Constructor for class org.apache.spark.ml.feature.VectorAttributeRewriter
vectorCol() - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
VectorImplicits - Class in org.apache.spark.mllib.linalg: Implicit methods available in Scala for converting org.apache.spark.mllib.linalg.Vector to org.apache.spark.ml.linalg.Vector and vice versa.
VectorImplicits() - Constructor for class org.apache.spark.mllib.linalg.VectorImplicits
VectorIndexer - Class in org.apache.spark.ml.feature: Class for indexing categorical feature columns in a dataset of Vector.
VectorIndexer(String) - Constructor for class org.apache.spark.ml.feature.VectorIndexer
VectorIndexer() - Constructor for class org.apache.spark.ml.feature.VectorIndexer
VectorIndexerModel - Class in org.apache.spark.ml.feature: Model fitted by VectorIndexer.
VectorIndexerParams - Interface in org.apache.spark.ml.feature: Private trait for params for VectorIndexer and VectorIndexerModel
Vectors - Class in org.apache.spark.ml.linalg: Factory methods for Vector.
Vectors() - Constructor for class org.apache.spark.ml.linalg.Vectors
Vectors - Class in org.apache.spark.mllib.linalg: Factory methods for Vector.
Vectors() - Constructor for class org.apache.spark.mllib.linalg.Vectors
vectorSize() - Method in interface org.apache.spark.ml.feature.Word2VecBase: The dimension of the code that you want to transform from words.
VectorSizeHint - Class in org.apache.spark.ml.feature: :: Experimental :: A feature transformer that adds size information to the metadata of a vector column.
VectorSizeHint(String) - Constructor for class org.apache.spark.ml.feature.VectorSizeHint
VectorSizeHint() - Constructor for class org.apache.spark.ml.feature.VectorSizeHint
VectorSlicer - Class in org.apache.spark.ml.feature: This class takes a feature vector and outputs a new feature vector with a subarray of the original features.
VectorSlicer(String) - Constructor for class org.apache.spark.ml.feature.VectorSlicer
VectorSlicer() - Constructor for class org.apache.spark.ml.feature.VectorSlicer
VectorTransformer - Interface in org.apache.spark.mllib.feature: :: DeveloperApi :: Trait for transformation of a vector
VectorType() - Static method in class org.apache.spark.ml.linalg.SQLDataTypes: Data type for Vector.
VectorUDT - Class in org.apache.spark.mllib.linalg: :: AlphaComponent ::
VectorUDT() - Constructor for class org.apache.spark.mllib.linalg.VectorUDT
version() - Method in class org.apache.spark.api.java.JavaSparkContext: The version of Spark on which this application is running.
version() - Method in class org.apache.spark.io.SnappyCompressionCodec
version() - Method in class org.apache.spark.SparkContext: The version of Spark on which this application is running.
version() - Method in interface org.apache.spark.sql.hive.client.HiveClient: Returns the Hive Version of this client.
version() - Method in class org.apache.spark.sql.SparkSession: The version of Spark on which this application is running.
VersionInfo - Class in org.apache.spark.status.api.v1
VersionUtils - Class in org.apache.spark.util: Utilities for working with Spark version strings
VersionUtils() - Constructor for class org.apache.spark.util.VersionUtils
vertcat(Matrix[]) - Static method in class org.apache.spark.ml.linalg.Matrices: Vertically concatenate a sequence of matrices.
vertcat(Matrix[]) - Static method in class org.apache.spark.mllib.linalg.Matrices: Vertically concatenate a sequence of matrices.
vertexAttr(long) - Method in class org.apache.spark.graphx.EdgeTriplet: Get the vertex object for the given vertex in the edge.
VertexPartitionBaseOpsConstructor<T extends org.apache.spark.graphx.impl.VertexPartitionBase<Object>> - Interface in org.apache.spark.graphx.impl: A typeclass for subclasses of VertexPartitionBase representing the ability to wrap them in a VertexPartitionBaseOps.
VertexRDD<VD> - Class in org.apache.spark.graphx: Extends RDD[(VertexId, VD)] by ensuring that there is only one entry for each vertex and by pre-indexing the entries for fast, efficient joins.
VertexRDD(SparkContext, Seq<Dependency<?>>) - Constructor for class org.apache.spark.graphx.VertexRDD
VertexRDDImpl<VD> - Class in org.apache.spark.graphx.impl
vertices() - Method in class org.apache.spark.graphx.Graph: An RDD containing the vertices and their associated attributes.
vertices() - Method in class org.apache.spark.graphx.impl.GraphImpl
viewToSeq(KVStoreView<T>, int, Function1<T, Object>) - Static method in class org.apache.spark.status.KVUtils: Turns a KVStoreView into a Scala sequence, applying a filter.
visit(int, int, String, String, String, String[]) - Method in class org.apache.spark.util.InnerClosureFinder
visitMethod(int, String, String, String, String[]) - Method in class org.apache.spark.util.InnerClosureFinder
visitMethod(int, String, String, String, String[]) - Method in class org.apache.spark.util.ReturnStatementFinder
vizHeaderNodes(HttpServletRequest) - Static method in class org.apache.spark.ui.UIUtils
vManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
vocabSize() - Method in class org.apache.spark.ml.clustering.LDAModel
vocabSize() - Method in interface org.apache.spark.ml.feature.CountVectorizerParams: Max size of the vocabulary.
vocabSize() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
vocabSize() - Method in class org.apache.spark.mllib.clustering.LDAModel: Vocabulary size (number of terms in the vocabulary)
vocabSize() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
vocabulary() - Method in class org.apache.spark.ml.feature.CountVectorizerModel
VocabWord - Class in org.apache.spark.mllib.feature: Entry in vocabulary
VocabWord(String, long, int[], int[], int) - Constructor for class org.apache.spark.mllib.feature.VocabWord
VoidFunction<T> - Interface in org.apache.spark.api.java.function: A function with no return value.
VoidFunction2<T1,T2> - Interface in org.apache.spark.api.java.function: A two-argument function that takes arguments of type T1 and T2 with no return value.
Vote() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
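
Several entries in this section distinguish population variance (var_pop, JavaDoubleRDD.variance(), StatCounter.variance()) from unbiased sample variance (var_samp, and functions.variance as its alias). The following is a minimal plain-Java sketch of that distinction only. It does not use or reproduce any Spark API; the class name VarianceSketch and its method names are hypothetical and chosen for illustration.

```java
public final class VarianceSketch {
    private VarianceSketch() {}

    /** Arithmetic mean; assumes a non-empty array. */
    public static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sum of squared deviations from the mean. */
    public static double sumSquaredDeviations(double[] xs) {
        double m = mean(xs);
        double acc = 0.0;
        for (double x : xs) {
            double d = x - m;
            acc += d * d;
        }
        return acc;
    }

    /** Population variance: divides by n. */
    public static double populationVariance(double[] xs) {
        return sumSquaredDeviations(xs) / xs.length;
    }

    /** Unbiased sample variance: divides by n - 1; assumes at least two values. */
    public static double sampleVariance(double[] xs) {
        return sumSquaredDeviations(xs) / (xs.length - 1);
    }

    public static void main(String[] args) {
        // Illustrative data only: mean is 5, squared deviations sum to 32.
        double[] xs = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("population variance = " + populationVariance(xs));
        System.out.println("sample variance     = " + sampleVariance(xs));
    }
}
```

The two results differ only in the divisor, so the sample variance is always the larger of the two for the same data, and the gap shrinks as the number of values grows.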

W

w(boolean) - Method in class org.apache.spark.ml.param.BooleanParam: Creates a param pair with the given value (for Java).
w(List<List<Double>>) - Method in class org.apache.spark.ml.param.DoubleArrayArrayParam: Creates a param pair with a `java.util.List` of values (for Java and Python).
w(List<Double>) - Method in class org.apache.spark.ml.param.DoubleArrayParam: Creates a param pair with a `java.util.List` of values (for Java and Python).
w(double) - Method in class org.apache.spark.ml.param.DoubleParam: Creates a param pair with the given value (for Java).
w(float) - Method in class org.apache.spark.ml.param.FloatParam: Creates a param pair with the given value (for Java).
w(List<Integer>) - Method in class org.apache.spark.ml.param.IntArrayParam: Creates a param pair with a `java.util.List` of values (for Java and Python).
w(int) - Method in class org.apache.spark.ml.param.IntParam: Creates a param pair with the given value (for Java).
w(long) - Method in class org.apache.spark.ml.param.LongParam: Creates a param pair with the given value (for Java).
w(T) - Method in class org.apache.spark.ml.param.Param: Creates a param pair with the given value (for Java).
w(List<String>) - Method in class org.apache.spark.ml.param.StringArrayParam: Creates a param pair with a `java.util.List` of values (for Java and Python).
waitTillTime(long) - Method in interface org.apache.spark.util.Clock
waitUntilEmpty(long) - Method in class org.apache.spark.scheduler.AsyncEventQueue: For testing only.
warmUp(SparkContext) - Static method in class org.apache.spark.streaming.util.RawTextHelper: Warms up the SparkContext in master and slave by running tasks to force the JIT to kick in before the real workload starts.
weakIntern(String) - Static method in class org.apache.spark.status.LiveEntityHelpers: String interning to reduce the memory usage.
weekofyear(Column) - Static method in class org.apache.spark.sql.functions: Extracts the week number as an integer from a given date/timestamp/string.
WeibullGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the Weibull distribution.
WeibullGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.WeibullGenerator
weight() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator: Weighted count of instances in this aggregator.
weight() - Method in interface org.apache.spark.scheduler.Schedulable
weightCol() - Method in interface org.apache.spark.ml.param.shared.HasWeightCol: Param for weight column name.
weightedFalsePositiveRate() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns weighted false positive rate.
weightedFalsePositiveRate() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted false positive rate
weightedFMeasure(double) - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns weighted averaged f-measure.
weightedFMeasure() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns weighted averaged f1-measure.
weightedFMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged f-measure
weightedFMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged f1-measure
weightedPrecision() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns weighted averaged precision.
weightedPrecision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged precision
weightedRecall() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns weighted averaged recall.
weightedRecall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged recall (equal to precision, recall and f-measure)
weightedTruePositiveRate() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Returns weighted true positive rate.
weightedTruePositiveRate() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted true positive rate (equal to precision, recall and f-measure)
weights() - Method in interface org.apache.spark.ml.ann.LayerModel
weights() - Method in interface org.apache.spark.ml.ann.TopologyModel
weights() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
weights() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
weights() - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
weights() - Method in class org.apache.spark.mllib.classification.impl.GLMClassificationModel.SaveLoadV1_0$.Data
weights() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
weights() - Method in class org.apache.spark.mllib.classification.SVMModel
weights() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
weights() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
weights() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
weights() - Method in class org.apache.spark.mllib.regression.impl.GLMRegressionModel.SaveLoadV1_0$.Data
weights() - Method in class org.apache.spark.mllib.regression.LassoModel
weights() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
weights() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
weightSize() - Method in interface org.apache.spark.ml.ann.Layer: Number of weights that is used to allocate memory for the weights vector
weightSum() - Method in interface org.apache.spark.ml.optim.aggregator.DifferentiableLossAggregator
WelchTTest - Class in org.apache.spark.mllib.stat.test: Performs Welch's 2-sample t-test.
WelchTTest() - Constructor for class org.apache.spark.mllib.stat.test.WelchTTest
when(Column, Object) - Method in class org.apache.spark.sql.Column: Evaluates a list of conditions and returns one of multiple possible result expressions.
when(Column, Object) - Static method in class org.apache.spark.sql.functions: Evaluates a list of conditions and returns one of multiple possible result expressions.
where(Column) - Method in class org.apache.spark.sql.Dataset: Filters rows using the given condition.
where(String) - Method in class org.apache.spark.sql.Dataset: Filters rows using the given SQL expression.
wholeTextFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String, int) - Method in class org.apache.spark.SparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
width() - Method in class org.apache.spark.util.sketch.CountMinSketch: Width of this CountMinSketch.
Window - Class in org.apache.spark.sql.expressions: Utility functions for defining window in DataFrames.
Window() - Constructor for class org.apache.spark.sql.expressions.Window
window(Column, String, String, String) - Static method in class org.apache.spark.sql.functions: Bucketize rows into one or more time windows given a timestamp specifying column.
window(Column, String, String) - Static method in class org.apache.spark.sql.functions: Bucketize rows into one or more time windows given a timestamp specifying column.
window(Column, String) - Static method in class org.apache.spark.sql.functions: Generates tumbling time windows given a timestamp specifying column.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
windowsDrive() - Static method in class org.apache.spark.util.Utils: Pattern for matching a Windows drive, which contains only a single alphabetic character.
windowSize() - Method in interface org.apache.spark.ml.feature.Word2VecBase: The window size (context words from [-window, window]).
WindowSpec - Class in org.apache.spark.sql.expressions: A window specification that defines the partitioning, ordering, and frame boundaries.
wipe() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace
withColumn(String, Column) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by adding a column or replacing the existing column that has the same name.
withColumnRenamed(String, String) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset with a column renamed.
withComment(String) - Method in class org.apache.spark.sql.types.StructField: Updates the StructField with a new comment value.
withContextClassLoader(ClassLoader, Function0<T>) - Static method in class org.apache.spark.util.Utils: Run a segment of code using a different context class loader in the current thread
withDummyCallSite(SparkContext, Function0<T>) - Static method in class org.apache.spark.util.Utils: To avoid calling Utils.getCallSite for every single RDD we create in the body, set a dummy call site that RDDs use instead.
withEdges(EdgeRDD<?>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
withEdges(EdgeRDD<?>) - Method in class org.apache.spark.graphx.VertexRDD: Prepares this VertexRDD for efficient joins with the given EdgeRDD.
withExtensions(Function1<SparkSessionExtensions, BoxedUnit>) - Method in class org.apache.spark.sql.SparkSession.Builder: Inject extensions into the SparkSession.
withHiveExternalCatalog(SparkContext) - Static method in class org.apache.spark.sql.hive.HiveUtils
withHiveState(Function0<A>) - Method in interface org.apache.spark.sql.hive.client.HiveClient: Run a function within Hive state (SessionState, HiveConf, Hive client and class loader)
withIndex(int) - Method in class org.apache.spark.ml.attribute.Attribute: Copy with a new index.
withIndex(int) - Method in class org.apache.spark.ml.attribute.BinaryAttribute
withIndex(int) - Method in class org.apache.spark.ml.attribute.NominalAttribute
withIndex(int) - Method in class org.apache.spark.ml.attribute.NumericAttribute
withIndex(int) - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
withListener(Function1<org.apache.spark.streaming.ui.StreamingJobProgressListener, T>) - Method in interface org.apache.spark.status.api.v1.streaming.BaseStreamingAppResource
withListener(SparkContext, L, Function1<L, BoxedUnit>) - Static method in class org.apache.spark.TestUtils: Runs some code with the given listener installed in the SparkContext.
withMapStatuses(Function1<MapStatus[], T>) - Method in class org.apache.spark.ShuffleStatus: Helper function which provides thread-safe access to the mapStatuses array.
withMax(double) - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy with a new max value.
withMean() - Method in interface org.apache.spark.ml.feature.StandardScalerParams: Whether to center the data with mean before scaling.
withMean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
withMetadata(Metadata) - Method in class org.apache.spark.sql.types.MetadataBuilder: Include the content of an existing Metadata instance.
withMin(double) - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy with a new min value.
withName(String) - Method in class org.apache.spark.ml.attribute.Attribute: Copy with a new name.
withName(String) - Method in class org.apache.spark.ml.attribute.BinaryAttribute
withName(String) - Method in class org.apache.spark.ml.attribute.NominalAttribute
withName(String) - Method in class org.apache.spark.ml.attribute.NumericAttribute
withName(String) - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
withName(String) - Static method in class org.apache.spark.mllib.tree.configuration.Algo
withName(String) - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
withName(String) - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
withName(String) - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
withName(String) - Static method in class org.apache.spark.rdd.CheckpointState
withName(String) - Static method in class org.apache.spark.rdd.DeterministicLevel
withName(String) - Static method in class org.apache.spark.scheduler.SchedulingMode
withName(String) - Static method in class org.apache.spark.scheduler.TaskLocality
withName(String) - Method in class org.apache.spark.sql.expressions.UserDefinedFunction: Updates UserDefinedFunction with a given name.
withName(String) - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
withName(String) - Static method in class org.apache.spark.TaskState
withNullSafe(Function1<Object, Object>) - Method in interface org.apache.spark.sql.hive.HiveInspectors
withNumValues(int) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy with a new numValues and empty values.
withoutIndex() - Method in class org.apache.spark.ml.attribute.Attribute: Copy without the index.
withoutIndex() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
withoutIndex() - Method in class org.apache.spark.ml.attribute.NominalAttribute
withoutIndex() - Method in class org.apache.spark.ml.attribute.NumericAttribute
withoutIndex() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
withoutMax() - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy without the max value.
withoutMin() - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy without the min value.
withoutName() - Method in class org.apache.spark.ml.attribute.Attribute: Copy without the name.
withoutName() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
withoutName() - Method in class org.apache.spark.ml.attribute.NominalAttribute
withoutName() - Method in class org.apache.spark.ml.attribute.NumericAttribute
withoutName() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
withoutNumValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy without the numValues.
withoutSparsity() - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy without the sparsity.
withoutStd() - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy without the standard deviation.
withoutSummary() - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy without summary statistics.
withoutValues() - Method in class org.apache.spark.ml.attribute.BinaryAttribute: Copy without the values.
withoutValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy without the values.
withPathFilter(double, SparkSession, long, Function0<T>) - Static method in class org.apache.spark.ml.image.SamplePathFilter: Sets the HDFS PathFilter flag and then restores it.
withPosition(Option<Object>, Option<Object>) - Method in exception org.apache.spark.sql.AnalysisException
withRecursiveFlag(boolean, SparkSession, Function0<T>) - Static method in class org.apache.spark.ml.image.RecursiveFlag: Sets the Spark recursive flag and then restores it.
withSparkUI(String, Option<String>, Function1<org.apache.spark.ui.SparkUI, T>) - Method in interface org.apache.spark.status.api.v1.UIRoot: Runs some code with the current SparkUI instance for the app / attempt.
withSparsity(double) - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy with a new sparsity.
withStd(double) - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy with a new standard deviation.
withStd() - Method in interface org.apache.spark.ml.feature.StandardScalerParams: Whether to scale the data to unit standard deviation.
withStd() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
withUI(Function1<org.apache.spark.ui.SparkUI, T>) - Method in interface org.apache.spark.status.api.v1.BaseAppResource
withValues(String, String) - Method in class org.apache.spark.ml.attribute.BinaryAttribute: Copy with new values.
withValues(String, String...) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy with new values and empty numValues.
withValues(String[]) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy with new values and empty numValues.
withValues(String, Seq<String>) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy with new values and empty numValues.
withWatermark(String, String) - Method in class org.apache.spark.sql.Dataset: Defines an event time watermark for this Dataset.
word() - Method in class org.apache.spark.mllib.feature.VocabWord
Word2Vec - Class in org.apache.spark.ml.feature: Word2Vec trains a model of Map(String, Vector), i.e.
Word2Vec(String) - Constructor for class org.apache.spark.ml.feature.Word2Vec
Word2Vec() - Constructor for class org.apache.spark.ml.feature.Word2Vec
Word2Vec - Class in org.apache.spark.mllib.feature: Word2Vec creates vector representation of words in a text corpus.
Word2Vec() - Constructor for class org.apache.spark.mllib.feature.Word2Vec
Word2VecBase - Interface in org.apache.spark.ml.feature: Params for Word2Vec and Word2VecModel.
Word2VecModel - Class in org.apache.spark.ml.feature: Model fitted by Word2Vec.
Word2VecModel - Class in org.apache.spark.mllib.feature: Word2Vec model. wordIndex maps each word to an index, which can be used to retrieve the corresponding vector from wordVectors. wordVectors is an array of length numWords * vectorSize; the vector for the word mapped to index i can be retrieved by the slice (i * vectorSize, i * vectorSize + vectorSize).
Word2VecModel(Map<String, float[]>) - Constructor for class org.apache.spark.mllib.feature.Word2VecModel
Word2VecModel.Word2VecModelWriter$ - Class in org.apache.spark.ml.feature
Word2VecModelWriter$() - Constructor for class org.apache.spark.ml.feature.Word2VecModel.Word2VecModelWriter$
workerId() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RemoveWorker
workerRemoved(String, String, String) - Method in interface org.apache.spark.scheduler.TaskScheduler: Processes a removed worker.
Workspace(int) - Constructor for class org.apache.spark.mllib.optimization.NNLS.Workspace
wrap(Object, ObjectInspector, DataType) - Method in interface org.apache.spark.sql.hive.HiveInspectors
wrap(InternalRow, Function1<Object, Object>[], Object[], DataType[]) - Method in interface org.apache.spark.sql.hive.HiveInspectors
wrap(Seq<Object>, Function1<Object, Object>[], Object[], DataType[]) - Method in interface org.apache.spark.sql.hive.HiveInspectors
wrap(Object, ObjectInspector, DataType) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
wrap(InternalRow, Function1<Object, Object>[], Object[], DataType[]) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
wrap(Seq<Object>, Function1<Object, Object>[], Object[], DataType[]) - Static method in class org.apache.spark.sql.hive.orc.OrcFileFormat
wrapperClass() - Static method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
wrapperFor(ObjectInspector, DataType) - Method in interface org.apache.spark.sql.hive.HiveInspectors: Wraps with Hive types based on object inspector.
wrapperToFileSinkDesc(HiveShim.ShimFileSinkDesc) - Static method in class org.apache.spark.sql.hive.HiveShim
wrapRDD(RDD<Double>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaPairRDD
wrapRDD(RDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
wrapRDD(RDD<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
wrapRDD(RDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
wrapRDD(RDD<T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
write(Tuple2<K, V>) - Method in class org.apache.spark.internal.io.HadoopWriteConfigUtil
write(RDD<Tuple2<K, V>>, HadoopWriteConfigUtil<K, V>, ClassTag<V>) - Static method in class org.apache.spark.internal.io.SparkHadoopWriter: Basic work flow of this command is: 1.
write(int) - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
write(byte[]) - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
write(byte[], int, int) - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
write() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
write() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
write() - Method in class org.apache.spark.ml.classification.LinearSVCModel
write() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: Returns a MLWriter instance for this ML instance.
write() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
write() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
write() - Method in class org.apache.spark.ml.classification.OneVsRest
write() - Method in class org.apache.spark.ml.classification.OneVsRestModel
write() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
write() - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
write() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
write() - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel: Returns a MLWriter instance for this ML instance.
write(String, SparkSession, Map<String, String>, PipelineStage) - Method in class org.apache.spark.ml.clustering.InternalKMeansModelWriter
write() - Method in class org.apache.spark.ml.clustering.KMeansModel: Returns a GeneralMLWriter instance for this ML instance.
write() - Method in class org.apache.spark.ml.clustering.LocalLDAModel
write(String, SparkSession, Map<String, String>, PipelineStage) - Method in class org.apache.spark.ml.clustering.PMMLKMeansModelWriter
write() - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
write() - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
write() - Method in class org.apache.spark.ml.feature.ColumnPruner
write() - Method in class org.apache.spark.ml.feature.CountVectorizerModel
write() - Method in class org.apache.spark.ml.feature.IDFModel
write() - Method in class org.apache.spark.ml.feature.ImputerModel
write() - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
write() - Method in class org.apache.spark.ml.feature.MinHashLSHModel
write() - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
write() - Method in class org.apache.spark.ml.feature.OneHotEncoderModel
write() - Method in class org.apache.spark.ml.feature.PCAModel
write() - Method in class org.apache.spark.ml.feature.RFormulaModel
write() - Method in class org.apache.spark.ml.feature.StandardScalerModel
write() - Method in class org.apache.spark.ml.feature.StringIndexerModel
write() - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
write() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
write() - Method in class org.apache.spark.ml.feature.Word2VecModel
write() - Method in class org.apache.spark.ml.fpm.FPGrowthModel
write() - Method in class org.apache.spark.ml.Pipeline
write() - Method in class org.apache.spark.ml.PipelineModel
write() - Method in class org.apache.spark.ml.recommendation.ALSModel
write() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
write() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
write() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
write() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel: Returns a MLWriter instance for this ML instance.
write(String, SparkSession, Map<String, String>, PipelineStage) - Method in class org.apache.spark.ml.regression.InternalLinearRegressionModelWriter
write() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
write() - Method in class org.apache.spark.ml.regression.LinearRegressionModel: Returns a GeneralMLWriter instance for this ML instance.
write(String, SparkSession, Map<String, String>, PipelineStage) - Method in class org.apache.spark.ml.regression.PMMLLinearRegressionModelWriter
write() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
write() - Method in class org.apache.spark.ml.tuning.CrossValidator
write() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
write() - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
write() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
write() - Method in interface org.apache.spark.ml.util.DefaultParamsWritable
write() - Method in interface org.apache.spark.ml.util.GeneralMLWritable: Returns an MLWriter instance for this ML instance.
write() - Method in interface org.apache.spark.ml.util.MLWritable: Returns an MLWriter instance for this ML instance.
write(String, SparkSession, Map<String, String>, PipelineStage) - Method in interface org.apache.spark.ml.util.MLWriterFormat: Function to write the provided pipeline stage out.
write(Kryo, Output, Iterable<?>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
write() - Method in class org.apache.spark.sql.Dataset: Interface for saving the content of the non-streaming Dataset out into external storage.
write(InternalRow) - Method in class org.apache.spark.sql.hive.execution.HiveOutputWriter
write(T) - Method in interface org.apache.spark.sql.sources.v2.writer.DataWriter: Writes one record.
write(ByteBuffer) - Method in class org.apache.spark.storage.CountingWritableChannel
write(int) - Method in class org.apache.spark.storage.TimeTrackingOutputStream
write(byte[]) - Method in class org.apache.spark.storage.TimeTrackingOutputStream
write(byte[], int, int) - Method in class org.apache.spark.storage.TimeTrackingOutputStream
write(ByteBuffer, long) - Method in class org.apache.spark.streaming.util.WriteAheadLog: Write the record to the log and return a record handle, which contains all the information necessary to read back the written record.
WRITE_TIME() - Method in class org.apache.spark.InternalAccumulator.shuffleWrite$
WriteAheadLog - Class in org.apache.spark.streaming.util: :: DeveloperApi :: This abstract class represents a write ahead log (aka journal) that is used by Spark Streaming to save the received data (by receivers) and associated metadata to a reliable storage, so that they can be recovered after driver failures.
WriteAheadLog() - Constructor for class org.apache.spark.streaming.util.WriteAheadLog
WriteAheadLogRecordHandle - Class in org.apache.spark.streaming.util: :: DeveloperApi :: This abstract class represents a handle that refers to a record written in a WriteAheadLog.
WriteAheadLogRecordHandle() - Constructor for class org.apache.spark.streaming.util.WriteAheadLogRecordHandle
WriteAheadLogUtils - Class in org.apache.spark.streaming.util: A helper class with utility functions related to the WriteAheadLog interface.
WriteAheadLogUtils() - Constructor for class org.apache.spark.streaming.util.WriteAheadLogUtils
writeAll(Iterator<T>, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream
writeBoolean(DataOutputStream, boolean) - Static method in class org.apache.spark.api.r.SerDe
writeBooleanArr(DataOutputStream, boolean[]) - Static method in class org.apache.spark.api.r.SerDe
writeByteBuffer(ByteBuffer, DataOutput) - Static method in class org.apache.spark.util.Utils: Primitive often used when writing ByteBuffer to DataOutput.
writeByteBuffer(ByteBuffer, OutputStream) - Static method in class org.apache.spark.util.Utils: Primitive often used when writing ByteBuffer to OutputStream.
writeBytes(DataOutputStream, byte[]) - Static method in class org.apache.spark.api.r.SerDe
writeBytes() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetricDistributions
writeDate(DataOutputStream, Date) - Static method in class org.apache.spark.api.r.SerDe
writeDouble(DataOutputStream, double) - Static method in class org.apache.spark.api.r.SerDe
writeDoubleArr(DataOutputStream, double[]) - Static method in class org.apache.spark.api.r.SerDe
writeEventLogs(String, Option<String>, ZipOutputStream) - Method in interface org.apache.spark.status.api.v1.UIRoot: Write the event logs for the given app to the ZipOutputStream instance.
writeExternal(ObjectOutput) - Method in class org.apache.spark.serializer.JavaSerializer
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.BlockManagerId
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.StorageLevel
writeInt(DataOutputStream, int) - Static method in class org.apache.spark.api.r.SerDe
writeIntArr(DataOutputStream, int[]) - Static method in class org.apache.spark.api.r.SerDe
writeJObj(DataOutputStream, Object, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
writeKey(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream: Writes the object representing the key of a key-value pair.
writeObject(DataOutputStream, Object, JVMObjectTracker) - Static method in class org.apache.spark.api.r.SerDe
writeObject(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream: The most general-purpose method to write an object.
WriterCommitMessage - Interface in org.apache.spark.sql.sources.v2.writer: A commit message that is returned by DataWriter.commit() and sent back to the driver side as the input parameter of DataSourceWriter.commit(WriterCommitMessage[]).
writeRecords() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetricDistributions
writeSqlObject(DataOutputStream, Object) - Static method in class org.apache.spark.sql.api.r.SQLUtils
writeStream() - Method in class org.apache.spark.sql.Dataset: Interface for saving the content of the streaming Dataset out into external storage.
writeString(DataOutputStream, String) - Static method in class org.apache.spark.api.r.SerDe
writeStringArr(DataOutputStream, String[]) - Static method in class org.apache.spark.api.r.SerDe
WriteSupport - Interface in org.apache.spark.sql.sources.v2: A mix-in interface for DataSourceV2.
writeTime(DataOutputStream, Time) - Static method in class org.apache.spark.api.r.SerDe
writeTime(DataOutputStream, Timestamp) - Static method in class org.apache.spark.api.r.SerDe
writeTime() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetricDistributions
writeTime() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetrics
writeTo(OutputStream) - Method in class org.apache.spark.util.sketch.BloomFilter: Writes out this BloomFilter to an output stream in binary format.
writeTo(OutputStream) - Method in class org.apache.spark.util.sketch.CountMinSketch: Writes out this CountMinSketch to an output stream in binary format.
writeType(DataOutputStream, String) - Static method in class org.apache.spark.api.r.SerDe
writeValue(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream: Writes the object representing the value of a key-value pair.
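
The Word2VecModel entry for org.apache.spark.mllib.feature in this section describes wordVectors as one flat array of length numWords * vectorSize, in which the vector for word index i occupies the slice (i * vectorSize, i * vectorSize + vectorSize). The following is a minimal plain-Java sketch of that indexing only. It does not call Spark; the class name Word2VecSliceSketch, the method vectorAt, and the use of float[] for the flat array are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the Spark API: it only illustrates the
// flat-array layout described in the Word2VecModel index entry.
class Word2VecSliceSketch {

    /**
     * Copies out the vector for word index i from a flat array that stores
     * numWords vectors of vectorSize elements each, back to back.
     */
    static float[] vectorAt(float[] wordVectors, int vectorSize, int i) {
        if (vectorSize <= 0) {
            throw new IllegalArgumentException("vectorSize must be positive");
        }
        int from = i * vectorSize;      // start of the slice (inclusive)
        int to = from + vectorSize;     // end of the slice (exclusive)
        if (i < 0 || to > wordVectors.length) {
            throw new IndexOutOfBoundsException("no vector stored for index " + i);
        }
        return Arrays.copyOfRange(wordVectors, from, to);
    }

    public static void main(String[] args) {
        // Two words, vectorSize = 3: word 0 -> first three values, word 1 -> last three.
        float[] flat = {0.1f, 0.2f, 0.3f, 0.4f, 0.5f, 0.6f};
        System.out.println(Arrays.toString(vectorAt(flat, 3, 1)));
    }
}
```

Copying the slice keeps the caller from mutating the shared flat array; a view over the original array would avoid the copy at the cost of that isolation.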

X

x() - Method in class org.apache.spark.mllib.optimization.NNLS.Workspace

Y

year(Column) - Static method in class org.apache.spark.sql.functions: Extracts the year as an integer from a given date/timestamp/string.

Z

zero() - Method in class org.apache.spark.Accumulable: Deprecated.
zero(R) - Method in interface org.apache.spark.AccumulableParam: Deprecated. Return the "zero" (identity) value for an accumulator type, given its initial value.
zero(double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$: Deprecated.
zero(float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$: Deprecated.
zero(int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$: Deprecated.
zero(long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$: Deprecated.
zero(String) - Method in class org.apache.spark.AccumulatorParam.StringAccumulatorParam$: Deprecated.
zero(int, int) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
zero() - Method in class org.apache.spark.sql.expressions.Aggregator: A zero value for this aggregation.
zeros(int, int) - Static method in class org.apache.spark.ml.linalg.DenseMatrix: Generate a DenseMatrix consisting of zeros.
zeros(int, int) - Static method in class org.apache.spark.ml.linalg.Matrices: Generate a Matrix consisting of zeros.
zeros(int) - Static method in class org.apache.spark.ml.linalg.Vectors: Creates a vector of all zeros.
zeros(int, int) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a DenseMatrix consisting of zeros.
zeros(int, int) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a Matrix consisting of zeros.
zeros(int) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a vector of all zeros.
zip(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zip(RDD, ClassTag) - Method in class org.apache.spark.rdd.RDD: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zipPartitions(JavaRDDLike<U, ?>, FlatMapFunction2<Iterator<T>, Iterator, V>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD, boolean, Function2<Iterator<T>, Iterator, Iterator<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD: Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD, Function2<Iterator<T>, Iterator, Iterator<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, boolean, Function3<Iterator<T>, Iterator, Iterator<C>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, Function3<Iterator<T>, Iterator, Iterator<C>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, RDD<D>, boolean, Function4<Iterator<T>, Iterator, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, RDD<D>, Function4<Iterator<T>, Iterator, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipWithIndex() - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with its element indices.
zipWithIndex() - Method in class org.apache.spark.rdd.RDD: Zips this RDD with its element indices.
zipWithUniqueId() - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with generated unique Long ids.
zipWithUniqueId() - Method in class org.apache.spark.rdd.RDD: Zips this RDD with generated unique Long ids.
ZStdCompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: ZStandard implementation of CompressionCodec.
ZStdCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.ZStdCompressionCodec
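
The zip and zipWithIndex entries in this section describe positional pairing: the first element with the first, the second with the second, and so on, or each element with its index. The sketch below mimics that pairing on ordinary in-memory lists to show the shape of the result. It is not Spark code and ignores partitioning; the class ZipSketch and its methods are hypothetical names, and the equal-length check and the zero-based index are assumptions of this sketch.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration on plain lists; real RDDs are distributed and partitioned.
class ZipSketch {

    /** Pairs elements by position: (left[0], right[0]), (left[1], right[1]), ... */
    static <T, U> List<Map.Entry<T, U>> zip(List<T> left, List<U> right) {
        if (left.size() != right.size()) {
            // Assumption of this sketch: both inputs must have the same length.
            throw new IllegalArgumentException("inputs must have the same number of elements");
        }
        List<Map.Entry<T, U>> out = new ArrayList<>(left.size());
        for (int i = 0; i < left.size(); i++) {
            out.add(new SimpleEntry<>(left.get(i), right.get(i)));
        }
        return out;
    }

    /** Pairs each element with its position, starting from 0. */
    static <T> List<Map.Entry<T, Long>> zipWithIndex(List<T> items) {
        List<Map.Entry<T, Long>> out = new ArrayList<>(items.size());
        long index = 0L;
        for (T item : items) {
            out.add(new SimpleEntry<>(item, index++));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(zip(List.of("a", "b", "c"), List.of(1, 2, 3)));
        System.out.println(zipWithIndex(List.of("x", "y")));
    }
}
```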

_

_1() - Method in class org.apache.spark.util.MutablePair
_2() - Method in class org.apache.spark.util.MutablePair
