Index

A

abs(Column) - Static method in class org.apache.spark.sql.functions: Computes the absolute value.
abs() - Method in class org.apache.spark.sql.types.Decimal
AbsoluteError - Class in org.apache.spark.mllib.tree.loss: :: DeveloperApi :: Class for absolute error loss calculation (for regression).
AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
accessTime() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
accId() - Method in class org.apache.spark.CleanAccum
Accumulable<R,T> - Class in org.apache.spark: A data type that can be accumulated, i.e., has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.
Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulable shared variable of the given type, to which tasks can "add" values with add.
accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext: Create an Accumulable shared variable, to which tasks can add values with +=.
accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext: Create an Accumulable shared variable, with a name for display in the Spark UI.
accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext: Create an accumulator from a "mutable collection" type.
AccumulableInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Information about an Accumulable modified during a task or stage.
AccumulableInfo - Class in org.apache.spark.status.api.v1
AccumulableParam<R,T> - Interface in org.apache.spark: Helper object defining how to accumulate values of a particular type.
accumulables() - Method in class org.apache.spark.scheduler.StageInfo: Terminal values of accumulables updated during this stage.
accumulables() - Method in class org.apache.spark.scheduler.TaskInfo: Intermediate updates to accumulables during this task.
Accumulator<T> - Class in org.apache.spark: A simpler value of Accumulable where the result type being accumulated is the same as the types of elements being merged.
Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator integer variable, which tasks can "add" values to using the add method.
accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator double variable, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator variable of a given type, which tasks can "add" values to using the add method.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext: Create an Accumulator variable of a given type, which tasks can "add" values to using the += method.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext: Create an Accumulator variable of a given type, with a name for display in the Spark UI.
AccumulatorParam<T> - Interface in org.apache.spark: A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value.
AccumulatorParam.DoubleAccumulatorParam$ - Class in org.apache.spark
AccumulatorParam.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
AccumulatorParam.FloatAccumulatorParam$ - Class in org.apache.spark
AccumulatorParam.FloatAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
AccumulatorParam.IntAccumulatorParam$ - Class in org.apache.spark
AccumulatorParam.IntAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
AccumulatorParam.LongAccumulatorParam$ - Class in org.apache.spark
AccumulatorParam.LongAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.StageData
accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.TaskData
accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns accuracy
acos(Column) - Static method in class org.apache.spark.sql.functions: Computes the cosine inverse of the given value; the returned angle is in the range 0.0 through pi.
acos(String) - Static method in class org.apache.spark.sql.functions: Computes the cosine inverse of the given column; the returned angle is in the range 0.0 through pi.
active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
activeJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
activeTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
ActorHelper - Interface in org.apache.spark.streaming.receiver: :: DeveloperApi :: A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming for processing.
actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream with an arbitrary user-implemented actor receiver.
ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver: :: DeveloperApi :: A helper with a set of defaults for the supervisor strategy.
ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
actorSystem() - Method in class org.apache.spark.SparkEnv
add(T) - Method in class org.apache.spark.Accumulable: Add more data to this accumulator / accumulable
add(org.apache.spark.ml.feature.Instance) - Method in class org.apache.spark.ml.classification.LogisticAggregator: Add a new training instance to this LogisticAggregator, and update the loss and gradient of the objective function.
add(AFTPoint) - Method in class org.apache.spark.ml.regression.AFTAggregator
add(org.apache.spark.ml.feature.Instance) - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator: Add a new training instance to this LeastSquaresAggregator, and update the loss and gradient of the objective function.
add(double[], MultivariateGaussian[], ExpectationSum, Vector<Object>) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator: Adds a new document.
add(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Adds two block matrices together.
add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Add a new sample to this summarizer, and update the statistical summary.
add(StructField) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field.
add(String, DataType) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new nullable field with no metadata.
add(String, DataType, boolean) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field with no metadata.
add(String, DataType, boolean, Metadata) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field and specifying metadata.
add(String, String) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new nullable field with no metadata where the dataType is specified as a String.
add(String, String, boolean) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field with no metadata where the dataType is specified as a String.
add(String, String, boolean, Metadata) - Method in class org.apache.spark.sql.types.StructType: Creates a new StructType by adding a new field and specifying metadata where the dataType is specified as a String.
add(Vector) - Method in class org.apache.spark.util.Vector
add_months(Column, int) - Static method in class org.apache.spark.sql.functions: Returns the date that is numMonths after startDate.
addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam: Add additional data to the accumulator value.
addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
addAppArgs(String...) - Method in class org.apache.spark.launcher.SparkLauncher: Adds command line arguments for the application.
addedFiles() - Method in class org.apache.spark.SparkContext
addedJars() - Method in class org.apache.spark.SparkContext
addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Add a file to be downloaded with this Spark job on every node.
addFile(String) - Method in class org.apache.spark.launcher.SparkLauncher: Adds a file to be submitted with the application.
addFile(String) - Method in class org.apache.spark.SparkContext: Add a file to be downloaded with this Spark job on every node.
addFile(String, boolean) - Method in class org.apache.spark.SparkContext: Add a file to be downloaded with this Spark job on every node.
addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds a param with multiple values (overwrites if the input param exists).
addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds a double param with multiple values.
addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds an int param with multiple values.
addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds a float param with multiple values.
addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds a long param with multiple values.
addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Adds a boolean param with true and false.
addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam: Merge two accumulated values together.
addInPlace(double, double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
addInPlace(float, float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
addInPlace(int, int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
addInPlace(long, long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
addInPlace(Vector) - Method in class org.apache.spark.util.Vector
addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
addIntercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Whether to add intercept (default: false).
addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(String) - Method in class org.apache.spark.launcher.SparkLauncher: Adds a jar file to be submitted with the application.
addJar(String) - Method in class org.apache.spark.SparkContext: Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(String) - Method in class org.apache.spark.sql.hive.HiveContext
addJar(String) - Method in class org.apache.spark.sql.SQLContext: Add a jar to SQLContext
addListener(SparkAppHandle.Listener) - Method in interface org.apache.spark.launcher.SparkAppHandle: Adds a listener to be notified of changes to the handle's information.
addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD: Add Hadoop configuration specific to a single partition and attempt.
addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContext: Adds a callback function to be executed on task completion.
addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
addPyFile(String) - Method in class org.apache.spark.launcher.SparkLauncher: Adds a Python file / zip / egg to be submitted with the application.
address() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
addSparkArg(String) - Method in class org.apache.spark.launcher.SparkLauncher: Adds a no-value argument to the Spark invocation.
addSparkArg(String, String) - Method in class org.apache.spark.launcher.SparkLauncher: Adds an argument with a value to the Spark invocation.
addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Register a listener to receive up-calls from events that happen during execution.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Add a StreamingListener object for receiving system events related to streaming.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext: Add a StreamingListener object for receiving system events related to streaming.
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext: Adds a (Java friendly) listener to be executed on task completion.
addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContext: Adds a listener in the form of a Scala closure to be executed on task completion.
AFTAggregator - Class in org.apache.spark.ml.regression
AFTAggregator(DenseVector<Object>, boolean) - Constructor for class org.apache.spark.ml.regression.AFTAggregator
AFTCostFun - Class in org.apache.spark.ml.regression
AFTCostFun(RDD<AFTPoint>, boolean) - Constructor for class org.apache.spark.ml.regression.AFTCostFun
AFTSurvivalRegression - Class in org.apache.spark.ml.regression: :: Experimental :: Fit a parametric survival regression model named accelerated failure time (AFT) model (https://en.wikipedia.org/wiki/Accelerated_failure_time_model) based on the Weibull distribution of the survival time.
AFTSurvivalRegression(String) - Constructor for class org.apache.spark.ml.regression.AFTSurvivalRegression
AFTSurvivalRegression() - Constructor for class org.apache.spark.ml.regression.AFTSurvivalRegression
AFTSurvivalRegressionModel - Class in org.apache.spark.ml.regression: :: Experimental :: Model produced by AFTSurvivalRegression.
agg(Column, Column...) - Method in class org.apache.spark.sql.DataFrame: Aggregates on the entire DataFrame without groups.
agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.DataFrame: (Scala-specific) Aggregates on the entire DataFrame without groups.
agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame: (Scala-specific) Aggregates on the entire DataFrame without groups.
agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame: (Java-specific) Aggregates on the entire DataFrame without groups.
agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.DataFrame: Aggregates on the entire DataFrame without groups.
agg(Column, Column...) - Method in class org.apache.spark.sql.GroupedData: Compute aggregates by specifying a series of aggregate columns.
agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.GroupedData: (Scala-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData: (Scala-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData: (Java-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.GroupedData: Compute aggregates by specifying a series of aggregate columns.
agg(TypedColumn<V, U1>) - Method in class org.apache.spark.sql.GroupedDataset: Computes the given aggregation, returning a Dataset of tuples for each unique key and the result of computing this aggregation over all elements in the group.
agg(TypedColumn<V, U1>, TypedColumn<V, U2>) - Method in class org.apache.spark.sql.GroupedDataset: Computes the given aggregations, returning a Dataset of tuples for each unique key and the result of computing these aggregations over all elements in the group.
agg(TypedColumn<V, U1>, TypedColumn<V, U2>, TypedColumn<V, U3>) - Method in class org.apache.spark.sql.GroupedDataset: Computes the given aggregations, returning a Dataset of tuples for each unique key and the result of computing these aggregations over all elements in the group.
agg(TypedColumn<V, U1>, TypedColumn<V, U2>, TypedColumn<V, U3>, TypedColumn<V, U4>) - Method in class org.apache.spark.sql.GroupedDataset: Computes the given aggregations, returning a Dataset of tuples for each unique key and the result of computing these aggregations over all elements in the group.
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.PairRDDFunctions: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.PairRDDFunctions: Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag) - Method in class org.apache.spark.rdd.PairRDDFunctions: Aggregate the values of each key, using given combine functions and a neutral "zero value".
AggregatedDialect - Class in org.apache.spark.sql.jdbc: AggregatedDialect can unify multiple dialects into one virtual Dialect.
AggregatedDialect(List<JdbcDialect>) - Constructor for class org.apache.spark.sql.jdbc.AggregatedDialect
aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph: Aggregates values from the neighboring edges and vertices of each vertex.
aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD: Aggregates vertices in messages that have the same ids using reduceFunc, returning a VertexRDD co-indexed with this.
AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
Aggregator<K,V,C> - Class in org.apache.spark: :: DeveloperApi :: A set of functions used to aggregate data.
Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
aggregator() - Method in class org.apache.spark.ShuffleDependency
Aggregator<I,B,O> - Class in org.apache.spark.sql.expressions: A base class for user-defined aggregations, which can be used in DataFrame and Dataset operations to take all of the elements of a group and reduce them to a single value.
Aggregator() - Constructor for class org.apache.spark.sql.expressions.Aggregator
aggUntyped(Seq<TypedColumn<?, ?>>) - Method in class org.apache.spark.sql.GroupedDataset: Internal helper function for building typed aggregations that return tuples.
Algo - Class in org.apache.spark.mllib.tree.configuration: :: Experimental :: Enum to select the algorithm for the decision tree
Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
algorithm() - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: The algorithm to use for updating.
algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
alias(String) - Method in class org.apache.spark.sql.Column: Gives the column an alias.
alias(String) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with an alias set.
alias(Symbol) - Method in class org.apache.spark.sql.DataFrame: (Scala-specific) Returns a new DataFrame with an alias set.
All - Static variable in class org.apache.spark.graphx.TripletFields: Expose all the fields (source, edge, and destination).
alpha() - Method in class org.apache.spark.mllib.random.WeibullGenerator
AlphaComponent - Annotation Type in org.apache.spark.annotation: A new component of Spark which may have unstable APIs.
ALS - Class in org.apache.spark.ml.recommendation: :: Experimental :: Alternating Least Squares (ALS) matrix factorization.
ALS(String) - Constructor for class org.apache.spark.ml.recommendation.ALS
ALS() - Constructor for class org.apache.spark.ml.recommendation.ALS
ALS - Class in org.apache.spark.mllib.recommendation
ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
ALS.Rating<ID> - Class in org.apache.spark.ml.recommendation: :: DeveloperApi :: Rating class for better code readability.
ALS.Rating(ID, ID, float) - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating
ALS.Rating$ - Class in org.apache.spark.ml.recommendation
ALS.Rating$() - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating$
ALSModel - Class in org.apache.spark.ml.recommendation: :: Experimental :: Model fitted by ALS.
AnalysisException - Exception in org.apache.spark.sql: :: DeveloperApi :: Thrown when a query fails to analyze, usually because the query itself is invalid.
AnalysisException(String, Option<Object>, Option<Object>) - Constructor for exception org.apache.spark.sql.AnalysisException
analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext: Analyzes the given table in the current database to generate statistics, which will be used in query optimizations.
analyzer() - Method in class org.apache.spark.sql.hive.HiveContext
analyzer() - Method in class org.apache.spark.sql.SQLContext
and(Column) - Method in class org.apache.spark.sql.Column: Boolean AND.
And - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff both left and right evaluate to true.
And(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.And
antecedent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
anyNull() - Method in interface org.apache.spark.sql.Row: Returns true if there are any NULL values in this row.
appAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils: Returns a new vector with 1.0 (bias) appended to the input vector.
appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
applicationAttemptId() - Method in class org.apache.spark.SparkContext
ApplicationAttemptInfo - Class in org.apache.spark.status.api.v1
applicationId() - Method in class org.apache.spark.SparkContext: A unique identifier for the Spark application.
ApplicationInfo - Class in org.apache.spark.status.api.v1
ApplicationStatus - Enum in org.apache.spark.status.api.v1
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph: Construct a graph from a collection of vertices and edges with attributes.
apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl: Create a graph from edges, setting referenced vertices to `defaultVertexAttr`.
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl: Create a graph from vertices and edges, setting missing vertices to `defaultVertexAttr`.
apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl: Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel: Execute a Pregel-like iterative vertex-parallel abstraction.
apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD: Constructs a standalone VertexRDD (one that is not set up for efficient joins with an EdgeRDD) from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD: Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD: Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Gets an attribute by its name.
apply(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Gets an attribute by its index.
apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap: Gets the value of the input param or its default value if it does not exist.
apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix: Gets the (i, j)-th element.
apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector: Gets the value of the ith element.
apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node: Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
apply(String) - Static method in class org.apache.spark.rdd.PartitionGroup
apply(long, String, Option<String>, String, boolean) - Static method in class org.apache.spark.scheduler.AccumulableInfo
apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
apply(Object) - Method in class org.apache.spark.sql.Column: Extracts a value or values from a complex type.
apply(String) - Method in class org.apache.spark.sql.DataFrame: Selects a column based on the column name and returns it as a Column.
apply(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Creates a Column for this UDAF using given Columns as input arguments.
apply(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Creates a Column for this UDAF using given Columns as input arguments.
apply(DataFrame, Seq<Expression>, GroupedData.GroupType) - Static method in class org.apache.spark.sql.GroupedData
apply(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i.
apply(DataType) - Static method in class org.apache.spark.sql.types.ArrayType: Construct an ArrayType object with the given element type.
apply(double) - Static method in class org.apache.spark.sql.types.Decimal
apply(long) - Static method in class org.apache.spark.sql.types.Decimal
apply(int) - Static method in class org.apache.spark.sql.types.Decimal
apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
apply(long, int, int) - Static method in class org.apache.spark.sql.types.Decimal
apply(String) - Static method in class org.apache.spark.sql.types.Decimal
apply() - Static method in class org.apache.spark.sql.types.DecimalType
apply(Option<PrecisionInfo>) - Static method in class org.apache.spark.sql.types.DecimalType
apply(DataType, DataType) - Static method in class org.apache.spark.sql.types.MapType: Construct a MapType object with the given key type and value type.
apply(String) - Method in class org.apache.spark.sql.types.StructType: Extracts a StructField of the given name.
apply(Set<String>) - Method in class org.apache.spark.sql.types.StructType: Returns a StructType containing StructFields of the given names, preserving the original order of fields.
apply(int) - Method in class org.apache.spark.sql.types.StructType
apply(Seq<Column>) - Method in class org.apache.spark.sql.UserDefinedFunction
apply(String) - Static method in class org.apache.spark.storage.BlockId: Converts a BlockId "name" String back into a BlockId.
apply(String, String, int) - Static method in class org.apache.spark.storage.BlockManagerId: Returns a BlockManagerId for the given configuration.
apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object without setting useOffHeap.
apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object.
apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Create a new StorageLevel object from its integer representation.
apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
apply(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
apply(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
apply(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
apply(long) - Static method in class org.apache.spark.streaming.Minutes
apply(long) - Static method in class org.apache.spark.streaming.Seconds
apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter: Build a StatCounter from a list of values.
apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter: Build a StatCounter from a list of values passed as variable-length arguments.
apply(int) - Method in class org.apache.spark.util.Vector
applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
applySchema(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
applySchemaToPythonRDD(RDD<Object[]>, String) - Method in class org.apache.spark.sql.SQLContext
applySchemaToPythonRDD(RDD<Object[]>, StructType) - Method in class org.apache.spark.sql.SQLContext
appName() - Method in class org.apache.spark.api.java.JavaSparkContext
appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
appName() - Method in class org.apache.spark.SparkContext
approxCountDistinct(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(Column, double) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(String, double) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the approximate number of distinct items in a group.
ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Computes the area under the precision-recall curve.
areaUnderROC() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Computes the area under the receiver operating characteristic (ROC) curve.
areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Computes the area under the receiver operating characteristic (ROC) curve.
argmax() - Method in class org.apache.spark.mllib.linalg.DenseVector
argmax() - Method in class org.apache.spark.mllib.linalg.SparseVector
argmax() - Method in interface org.apache.spark.mllib.linalg.Vector: Find the index of a maximal element.
arr() - Method in class org.apache.spark.rdd.PartitionGroup
array(DataType) - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type array.
array(Column...) - Static method in class org.apache.spark.sql.functions: Creates a new array column.
array(String, String...) - Static method in class org.apache.spark.sql.functions: Creates a new array column.
array(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Creates a new array column.
array(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Creates a new array column.
array_contains(Column, Object) - Static method in class org.apache.spark.sql.functions: Returns true if the array contains the value.
arrayLengthGt(double) - Static method in class org.apache.spark.ml.param.ParamValidators: Check that the array length is greater than lowerBound.
ArrayType - Class in org.apache.spark.sql.types
ArrayType(DataType, boolean) - Constructor for class org.apache.spark.sql.types.ArrayType
ArrayType() - Constructor for class org.apache.spark.sql.types.ArrayType: No-arg constructor for kryo.
as(Encoder) - Method in class org.apache.spark.sql.Column: Provides a type hint about the expected return value of this column.
as(String) - Method in class org.apache.spark.sql.Column: Gives the column an alias.
as(Seq<String>) - Method in class org.apache.spark.sql.Column: (Scala-specific) Assigns the given aliases to the results of a table generating function.
as(String[]) - Method in class org.apache.spark.sql.Column: Assigns the given aliases to the results of a table generating function.
as(Symbol) - Method in class org.apache.spark.sql.Column: Gives the column an alias.
as(String, Metadata) - Method in class org.apache.spark.sql.Column: Gives the column an alias with metadata.
as(Encoder) - Method in class org.apache.spark.sql.DataFrame: :: Experimental :: Converts this DataFrame to a strongly-typed Dataset containing objects of the specified type, U.
as(String) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with an alias set.
as(Symbol) - Method in class org.apache.spark.sql.DataFrame: (Scala-specific) Returns a new DataFrame with an alias set.
as(Encoder) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset where each record has been mapped on to the specified type.
as(String) - Method in class org.apache.spark.sql.Dataset: Applies a logical alias to this Dataset that can be used to disambiguate columns that have the same name after two Datasets have been joined.
asc() - Method in class org.apache.spark.sql.Column: Returns an ordering used in sorting.
asc(String) - Static method in class org.apache.spark.sql.functions: Returns a sort expression based on ascending order of the column.
ascii(Column) - Static method in class org.apache.spark.sql.functions: Computes the numeric value of the first character of the string column, and returns the result as an int column.
asin(Column) - Static method in class org.apache.spark.sql.functions: Computes the sine inverse of the given value; the returned angle is in the range -pi/2 through pi/2.
asin(String) - Static method in class org.apache.spark.sql.functions: Computes the sine inverse of the given column; the returned angle is in the range -pi/2 through pi/2.
asIntegral() - Method in class org.apache.spark.sql.types.DecimalType
asIntegral() - Method in class org.apache.spark.sql.types.DoubleType
asIntegral() - Method in class org.apache.spark.sql.types.FloatType
asIterator() - Method in class org.apache.spark.serializer.DeserializationStream: Read the elements of this stream through an iterator.
asJavaPairRDD() - Method in class org.apache.spark.api.r.PairwiseRRDD
asJavaRDD() - Method in class org.apache.spark.api.r.RRDD
asJavaRDD() - Method in class org.apache.spark.api.r.StringRRDD
asKeyValueIterator() - Method in class org.apache.spark.serializer.DeserializationStream: Read the elements of this stream through an iterator over key-value pairs.
AskPermissionToCommitOutput - Class in org.apache.spark.scheduler
AskPermissionToCommitOutput(int, int, int) - Constructor for class org.apache.spark.scheduler.AskPermissionToCommitOutput
askTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
asRDDId() - Method in class org.apache.spark.storage.BlockId
assertValid() - Method in class org.apache.spark.broadcast.Broadcast: Check if this broadcast is valid.
assignments() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
AssociationRules - Class in org.apache.spark.mllib.fpm: :: Experimental ::
AssociationRules() - Constructor for class org.apache.spark.mllib.fpm.AssociationRules: Constructs a default instance with default parameters {minConfidence = 0.8}.
AssociationRules.Rule<Item> - Class in org.apache.spark.mllib.fpm: :: Experimental ::
AsyncRDDActions<T> - Class in org.apache.spark.rdd: A set of asynchronous RDD actions available through an implicit conversion.
AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
atan(Column) - Static method in class org.apache.spark.sql.functions: Computes the tangent inverse of the given value.
atan(String) - Static method in class org.apache.spark.sql.functions: Computes the tangent inverse of the given column.
atan2(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(Column, String) - Static method in class org.apache.spark.sql.functions: Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(String, Column) - Static method in class org.apache.spark.sql.functions: Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(String, String) - Static method in class org.apache.spark.sql.functions: Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(Column, double) - Static method in class org.apache.spark.sql.functions: Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(String, double) - Static method in class org.apache.spark.sql.functions: Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(double, Column) - Static method in class org.apache.spark.sql.functions: Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(double, String) - Static method in class org.apache.spark.sql.functions: Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
attempt() - Method in class org.apache.spark.scheduler.TaskInfo
attempt() - Method in class org.apache.spark.status.api.v1.TaskData
attemptId() - Method in class org.apache.spark.scheduler.StageInfo
attemptId() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
attemptId() - Method in class org.apache.spark.status.api.v1.StageData
attemptId() - Method in class org.apache.spark.TaskContext
attemptNumber() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
attemptNumber() - Method in class org.apache.spark.scheduler.TaskInfo
attemptNumber() - Method in class org.apache.spark.TaskCommitDenied
attemptNumber() - Method in class org.apache.spark.TaskContext: How many times this task has been attempted.
attempts() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
attr() - Method in class org.apache.spark.graphx.Edge
attr() - Method in class org.apache.spark.graphx.EdgeContext: The attribute associated with the edge.
attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
Attribute - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: Abstract class for ML attributes.
Attribute() - Constructor for class org.apache.spark.ml.attribute.Attribute
attribute() - Method in class org.apache.spark.sql.sources.EqualNullSafe
attribute() - Method in class org.apache.spark.sql.sources.EqualTo
attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
attribute() - Method in class org.apache.spark.sql.sources.In
attribute() - Method in class org.apache.spark.sql.sources.IsNotNull
attribute() - Method in class org.apache.spark.sql.sources.IsNull
attribute() - Method in class org.apache.spark.sql.sources.LessThan
attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
attribute() - Method in class org.apache.spark.sql.sources.StringContains
attribute() - Method in class org.apache.spark.sql.sources.StringEndsWith
attribute() - Method in class org.apache.spark.sql.sources.StringStartsWith
AttributeGroup - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: Attributes that describe a vector ML column.
AttributeGroup(String) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup: Creates an attribute group without attribute info.
AttributeGroup(String, int) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup: Creates an attribute group knowing only the number of attributes.
AttributeGroup(String, Attribute[]) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup: Creates an attribute group with attributes.
attributes() - Method in class org.apache.spark.ml.attribute.AttributeGroup: Optional array of attributes.
AttributeType - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: An enum-like type for attribute types: AttributeType$.Numeric, AttributeType$.Nominal, and AttributeType$.Binary.
AttributeType(String) - Constructor for class org.apache.spark.ml.attribute.AttributeType
attrType() - Method in class org.apache.spark.ml.attribute.Attribute: Attribute type.
attrType() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
attrType() - Method in class org.apache.spark.ml.attribute.NominalAttribute
attrType() - Method in class org.apache.spark.ml.attribute.NumericAttribute
attrType() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
available() - Method in class org.apache.spark.storage.BufferReleasingInputStream
avg(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the average of the values in a group.
avg(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the average of the values in a group.
avg(String...) - Method in class org.apache.spark.sql.GroupedData: Compute the mean value for each numeric column for each group.
avg(Seq<String>) - Method in class org.apache.spark.sql.GroupedData: Compute the mean value for each numeric column for each group.
avgMetrics() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Deprecated. As of 1.3.0, replaced by awaitTerminationOrTimeout(Long).
awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext: Wait for the execution to stop.
awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext: Deprecated. As of 1.3.0, replaced by awaitTerminationOrTimeout(Long).
awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Wait for the execution to stop.
awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.StreamingContext: Wait for the execution to stop.

B

base64(Column) - Static method in class org.apache.spark.sql.functions: Computes the BASE64 encoding of a binary column and returns it as a string column.
baseOn(ParamPair<?>...) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Sets the given parameters in this grid to fixed values.
baseOn(ParamMap) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Sets the given parameters in this grid to fixed values.
baseOn(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Sets the given parameters in this grid to fixed values.
BaseRelation - Class in org.apache.spark.sql.sources: :: DeveloperApi :: Represents a collection of tuples with a known schema.
BaseRelation() - Constructor for class org.apache.spark.sql.sources.BaseRelation
baseRelationToDataFrame(BaseRelation) - Method in class org.apache.spark.sql.SQLContext
BaseRRDD<T,U> - Class in org.apache.spark.api.r
BaseRRDD(RDD<T>, int, byte[], String, String, byte[], Broadcast<Object>[], ClassTag<T>, ClassTag) - Constructor for class org.apache.spark.api.r.BaseRRDD
baseScope() - Method in class org.apache.spark.streaming.dstream.DStream: The base scope associated with the operation that created this DStream.
baseScope() - Method in class org.apache.spark.streaming.dstream.InputDStream: The base scope associated with the operation that created this DStream.
BATCHES() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
BatchInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Class having information on completed batches.
BatchInfo(Time, Map<Object, StreamInputInfo>, long, Option<Object>, Option<Object>, Map<Object, OutputOperationInfo>) - Constructor for class org.apache.spark.streaming.scheduler.BatchInfo
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
batchInfos() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
batchTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
batchTime() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
bean(Class<T>) - Static method in class org.apache.spark.sql.Encoders: Creates an encoder for Java Bean of type T.
beforeFetch(Connection, Map<String, String>) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Override connection specific properties to run before a select is made.
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
Bernoulli() - Static method in class org.apache.spark.mllib.classification.NaiveBayes
BernoulliCellSampler<T> - Class in org.apache.spark.util.random: :: DeveloperApi :: A sampler based on Bernoulli trials for partitioning a data sequence.
BernoulliCellSampler(double, double, boolean) - Constructor for class org.apache.spark.util.random.BernoulliCellSampler
BernoulliSampler<T> - Class in org.apache.spark.util.random: :: DeveloperApi :: A sampler based on Bernoulli trials.
BernoulliSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.BernoulliSampler
bestModel() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
bestModel() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
beta() - Method in class org.apache.spark.mllib.random.WeibullGenerator
between(Object, Object) - Method in class org.apache.spark.sql.Column: True if the current column is between the lower bound and upper bound, inclusive.
bin(Column) - Static method in class org.apache.spark.sql.functions: An expression that returns the string representation of the binary value of the given long column.
bin(String) - Static method in class org.apache.spark.sql.functions: An expression that returns the string representation of the binary value of the given long column.
Binarizer - Class in org.apache.spark.ml.feature: :: Experimental :: Binarize a column of continuous features given a threshold.
Binarizer(String) - Constructor for class org.apache.spark.ml.feature.Binarizer
Binarizer() - Constructor for class org.apache.spark.ml.feature.Binarizer
Binary() - Static method in class org.apache.spark.ml.attribute.AttributeType: Binary type.
binary() - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type binary.
BinaryAttribute - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: A binary attribute.
BinaryClassificationEvaluator - Class in org.apache.spark.ml.evaluation: :: Experimental :: Evaluator for binary classification, which expects two input columns: rawPrediction and label.
BinaryClassificationEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
BinaryClassificationEvaluator() - Constructor for class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
BinaryClassificationMetrics - Class in org.apache.spark.mllib.evaluation: Evaluator for binary classification.
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>, int) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Defaults numBins to 0.
binaryFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of binary files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI as a byte array.
binaryFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of binary files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI as a byte array.
binaryFiles(String, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop-readable dataset as PortableDataStream for each file (useful for binary data)
binaryLabelValidator() - Static method in class org.apache.spark.mllib.util.DataValidators: Function to check if labels used for classification are either zero or one.
BinaryLogisticRegressionSummary - Class in org.apache.spark.ml.classification: :: Experimental :: Binary Logistic regression results for a given model.
BinaryLogisticRegressionTrainingSummary - Class in org.apache.spark.ml.classification: :: Experimental :: Logistic regression training results.
binaryRecords(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Load data from a flat binary file, assuming the length of each record is constant.
binaryRecords(String, int, Configuration) - Method in class org.apache.spark.SparkContext: Load data from a flat binary file, assuming the length of each record is constant.
binaryRecordsStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as flat binary files with fixed record lengths, yielding byte arrays.
binaryRecordsStream(String, int) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as flat binary files, assuming a fixed length per record, generating one byte array per record.
BinarySample - Class in org.apache.spark.mllib.stat.test: Class that represents the group and value of a sample.
BinarySample(boolean, double) - Constructor for class org.apache.spark.mllib.stat.test.BinarySample
BinaryType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing Array[Byte] values.
BinaryType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the BinaryType object.
BisectingKMeans - Class in org.apache.spark.mllib.clustering: A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark.
BisectingKMeans() - Constructor for class org.apache.spark.mllib.clustering.BisectingKMeans: Constructs with the default configuration
BisectingKMeansModel - Class in org.apache.spark.mllib.clustering: Clustering model produced by BisectingKMeans.
bitwiseAND(Object) - Method in class org.apache.spark.sql.Column: Compute bitwise AND of this expression with another expression.
bitwiseNOT(Column) - Static method in class org.apache.spark.sql.functions: Computes bitwise NOT.
bitwiseOR(Object) - Method in class org.apache.spark.sql.Column: Compute bitwise OR of this expression with another expression.
bitwiseXOR(Object) - Method in class org.apache.spark.sql.Column: Compute bitwise XOR of this expression with another expression.
BlockId - Class in org.apache.spark.storage: :: DeveloperApi :: Identifies a particular Block of data, usually associated with a single file.
BlockId() - Constructor for class org.apache.spark.storage.BlockId
blockId() - Method in class org.apache.spark.storage.BlockUpdatedInfo
blockManager() - Method in class org.apache.spark.SparkEnv
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
BlockManagerId - Class in org.apache.spark.storage: :: DeveloperApi :: This class represents a unique identifier for a BlockManager.
blockManagerId() - Method in class org.apache.spark.storage.BlockUpdatedInfo
blockManagerId() - Method in class org.apache.spark.storage.StorageStatus
blockManagerIdCache() - Static method in class org.apache.spark.storage.BlockManagerId
blockManagerIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
BlockMatrix - Class in org.apache.spark.mllib.linalg.distributed: Represents a distributed matrix in blocks of local matrices.
BlockMatrix(RDD<Tuple2<Tuple2<Object, Object>, Matrix>>, int, int, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.BlockMatrix
BlockMatrix(RDD<Tuple2<Tuple2<Object, Object>, Matrix>>, int, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Alternate constructor for BlockMatrix without the input of the number of rows and columns.
blockName() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
BlockNotFoundException - Exception in org.apache.spark.storage
BlockNotFoundException(String) - Constructor for exception org.apache.spark.storage.BlockNotFoundException
blockReplication() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
blocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
blocks() - Method in class org.apache.spark.storage.StorageStatus: Return the blocks stored in this block manager.
blockSize() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
BlockStatus - Class in org.apache.spark.storage
BlockStatus(StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockStatus
blockTransferService() - Method in class org.apache.spark.SparkEnv
blockUpdatedInfo() - Method in class org.apache.spark.scheduler.SparkListenerBlockUpdated
BlockUpdatedInfo - Class in org.apache.spark.storage: :: DeveloperApi :: Stores information about a block status in a block manager.
BlockUpdatedInfo(BlockManagerId, BlockId, StorageLevel, long, long, long) - Constructor for class org.apache.spark.storage.BlockUpdatedInfo
bmAddress() - Method in class org.apache.spark.FetchFailed
BOOLEAN() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable boolean type.
BooleanParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Boolean] for Java.
BooleanParam(String, String, String) - Constructor for class org.apache.spark.ml.param.BooleanParam
BooleanParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.BooleanParam
BooleanType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing Boolean values.
BooleanType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the BooleanType object.
booleanWritableConverter() - Static method in class org.apache.spark.SparkContext
boolToBoolWritable(boolean) - Static method in class org.apache.spark.SparkContext
BoostingStrategy - Class in org.apache.spark.mllib.tree.configuration: Configuration options for GradientBoostedTrees.
BoostingStrategy(Strategy, Loss, int, double, double) - Constructor for class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Both() - Static method in class org.apache.spark.graphx.EdgeDirection: Edges originating from *and* arriving at a vertex of interest.
boundaries() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel: Boundaries in increasing order for which predictions are known.
boundaries() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
BoundedDouble - Class in org.apache.spark.partial: A Double value with error bars and associated confidence.
BoundedDouble(double, double, double, double) - Constructor for class org.apache.spark.partial.BoundedDouble
boundTEncoder() - Method in class org.apache.spark.sql.Dataset: The encoder where the expressions used to construct an object from an input row have been bound to the ordinals of the given schema.
broadcast(T) - Method in class org.apache.spark.api.java.JavaSparkContext: Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
Broadcast<T> - Class in org.apache.spark.broadcast: A broadcast variable.
Broadcast(long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.Broadcast
broadcast(T, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
broadcast(DataFrame) - Static method in class org.apache.spark.sql.functions: Marks a DataFrame as small enough for use in broadcast joins.
BROADCAST() - Static method in class org.apache.spark.storage.BlockId
BroadcastBlockId - Class in org.apache.spark.storage
BroadcastBlockId(long, String) - Constructor for class org.apache.spark.storage.BroadcastBlockId
BroadcastFactory - Interface in org.apache.spark.broadcast: :: DeveloperApi :: An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations).
broadcastId() - Method in class org.apache.spark.CleanBroadcast
broadcastId() - Method in class org.apache.spark.storage.BroadcastBlockId
broadcastManager() - Method in class org.apache.spark.SparkEnv
Broker - Class in org.apache.spark.streaming.kafka: Represents the host and port info for a Kafka broker.
Bucketizer - Class in org.apache.spark.ml.feature: :: Experimental :: Bucketizer maps a column of continuous features to a column of feature buckets.
Bucketizer(String) - Constructor for class org.apache.spark.ml.feature.Bucketizer
Bucketizer() - Constructor for class org.apache.spark.ml.feature.Bucketizer
BufferReleasingInputStream - Class in org.apache.spark.storage: Helper class that ensures a ManagedBuffer is released upon InputStream.close().
BufferReleasingInputStream(InputStream, ShuffleBlockFetcherIterator) - Constructor for class org.apache.spark.storage.BufferReleasingInputStream
bufferSchema() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: A StructType represents data types of values in the aggregation buffer.
build() - Method in class org.apache.spark.ml.tuning.ParamGridBuilder: Builds and returns all combinations of parameters specified by the param grid.
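The phrase "all combinations of parameters" can be read as a Cartesian product over the value lists in the grid. The following is a minimal plain-Java sketch of that idea and not Spark's implementation: the class `ParamGridSketch`, its `build` method, and the string parameter names are hypothetical and stand in for Spark's `Param` and `ParamMap` types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: expands a grid of named parameter
// values into every combination, one map per combination.
class ParamGridSketch {
    static List<Map<String, Object>> build(Map<String, List<Object>> grid) {
        List<Map<String, Object>> combos = new ArrayList<>();
        combos.add(new LinkedHashMap<String, Object>()); // one empty combination to start
        for (Map.Entry<String, List<Object>> entry : grid.entrySet()) {
            List<Map<String, Object>> next = new ArrayList<>();
            // Extend every partial combination with every value of this parameter.
            for (Map<String, Object> partial : combos) {
                for (Object value : entry.getValue()) {
                    Map<String, Object> extended = new LinkedHashMap<>(partial);
                    extended.put(entry.getKey(), value);
                    next.add(extended);
                }
            }
            combos = next;
        }
        return combos;
    }

    public static void main(String[] args) {
        // Parameter names below are illustrative strings only.
        Map<String, List<Object>> grid = new LinkedHashMap<>();
        grid.put("regParam", Arrays.<Object>asList(0.1, 0.01));
        grid.put("maxIter", Arrays.<Object>asList(10, 50, 100));
        for (Map<String, Object> combo : build(grid)) {
            System.out.println(combo);
        }
    }
}
```

A grid with two values for one parameter and three for another yields six combinations, which is why grid size grows multiplicatively with each added parameter.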
build(Node[]) - Method in class org.apache.spark.mllib.tree.model.Node: Build the left and right nodes if this node is not a leaf.
build() - Method in class org.apache.spark.sql.types.MetadataBuilder: Builds the Metadata instance.
buildFormattedString(DataType, String, StringBuilder) - Static method in class org.apache.spark.sql.types.DataType
buildJobStageDependencies(int, Seq<Object>) - Method in class org.apache.spark.scheduler.JobLogger: Build up the maps that represent stage-job relationships
buildScan(Seq<Attribute>, Seq<Expression>) - Method in interface org.apache.spark.sql.sources.CatalystScan
buildScan(FileStatus[]) - Method in class org.apache.spark.sql.sources.HadoopFsRelation: For a non-partitioned relation, this method builds an RDD[Row] containing all rows within this relation.
buildScan(String[], FileStatus[]) - Method in class org.apache.spark.sql.sources.HadoopFsRelation: For a non-partitioned relation, this method builds an RDD[Row] containing all rows within this relation.
buildScan(String[], Filter[], FileStatus[]) - Method in class org.apache.spark.sql.sources.HadoopFsRelation: For a non-partitioned relation, this method builds an RDD[Row] containing all rows within this relation.
buildScan(String[], Filter[]) - Method in interface org.apache.spark.sql.sources.PrunedFilteredScan
buildScan(String[]) - Method in interface org.apache.spark.sql.sources.PrunedScan
buildScan() - Method in interface org.apache.spark.sql.sources.TableScan
BYTE() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable byte type.
ByteDecimal() - Static method in class org.apache.spark.sql.types.DecimalType
bytesRead() - Method in class org.apache.spark.status.api.v1.InputMetricDistributions
bytesRead() - Method in class org.apache.spark.status.api.v1.InputMetrics
bytesToBytesWritable(byte[]) - Static method in class org.apache.spark.SparkContext
bytesWritableConverter() - Static method in class org.apache.spark.SparkContext
bytesWritten() - Method in class org.apache.spark.status.api.v1.OutputMetricDistributions
bytesWritten() - Method in class org.apache.spark.status.api.v1.OutputMetrics
bytesWritten() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetrics
ByteType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing Byte values.
ByteType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the ByteType object.

C

cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaPairRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.api.java.JavaRDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.graphx.Graph: Caches the vertices and edges associated with this graph at the previously-specified target storage levels, which default to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl: Persists the edge partitions using `targetStorageLevel`, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl: Persists the vertex partitions at `targetStorageLevel`, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Caches the underlying RDD.
cache() - Method in class org.apache.spark.rdd.RDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
cache() - Method in class org.apache.spark.sql.DataFrame: Persist this DataFrame with the default storage level (MEMORY_AND_DISK).
cache() - Method in class org.apache.spark.sql.Dataset: Persist this Dataset with the default storage level (MEMORY_AND_DISK).
cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
cache() - Method in class org.apache.spark.streaming.dstream.DStream: Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
cachedLeafStatuses() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
cacheManager() - Method in class org.apache.spark.SparkEnv
cacheManager() - Method in class org.apache.spark.sql.SQLContext
cacheTable(String) - Method in class org.apache.spark.sql.SQLContext: Caches the specified table in-memory.
calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.classification.LogisticCostFun
calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.regression.AFTCostFun
calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.regression.LeastSquaresCostFun
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy: :: DeveloperApi :: variance calculation
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini: :: DeveloperApi :: variance calculation
calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity: :: DeveloperApi :: information calculation for regression
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance: :: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance: :: DeveloperApi :: variance calculation
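The two `calculate` shapes above take either per-class counts with a total (classification) or a count, sum, and sum of squares (regression). As a rough illustration, the textbook definitions of Gini impurity, entropy, and variance can be written against those same argument shapes. This is a sketch under the assumption that the standard formulas apply; the class `ImpuritySketch` and its method names are hypothetical and are not Spark code.

```java
// Hypothetical helper, not part of Spark: textbook impurity measures written
// against the same argument shapes as the calculate(...) entries above.
class ImpuritySketch {
    // Gini impurity: 1 - sum_i p_i^2, where p_i = counts[i] / totalCount.
    static double gini(double[] counts, double totalCount) {
        if (totalCount == 0) {
            return 0.0;
        }
        double impurity = 1.0;
        for (double c : counts) {
            double p = c / totalCount;
            impurity -= p * p;
        }
        return impurity;
    }

    // Entropy in bits: -sum_i p_i * log2(p_i), skipping empty classes.
    static double entropy(double[] counts, double totalCount) {
        if (totalCount == 0) {
            return 0.0;
        }
        double h = 0.0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / totalCount;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    // Population variance from sufficient statistics: E[x^2] - (E[x])^2.
    static double variance(double count, double sum, double sumSquares) {
        if (count == 0) {
            return 0.0;
        }
        double mean = sum / count;
        return sumSquares / count - mean * mean;
    }

    public static void main(String[] args) {
        double[] counts = {5, 5};
        System.out.println(gini(counts, 10));
        System.out.println(entropy(counts, 10));
        System.out.println(variance(4, 10, 30)); // statistics of the values 1, 2, 3, 4
    }
}
```

Passing sufficient statistics instead of raw values lets a tree learner evaluate a candidate split from aggregated counts without revisiting the data.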
CalendarIntervalType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing calendar time intervals.
CalendarIntervalType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the CalendarIntervalType object.
call(K, Iterator<V1>, Iterator<V2>) - Method in interface org.apache.spark.api.java.function.CoGroupFunction
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
call(T) - Method in interface org.apache.spark.api.java.function.FilterFunction
call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
call(K, Iterator<V>) - Method in interface org.apache.spark.api.java.function.FlatMapGroupsFunction
call(T) - Method in interface org.apache.spark.api.java.function.ForeachFunction
call(Iterator<T>) - Method in interface org.apache.spark.api.java.function.ForeachPartitionFunction
call(T1) - Method in interface org.apache.spark.api.java.function.Function
call() - Method in interface org.apache.spark.api.java.function.Function0
call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
call(T1, T2, T3, T4) - Method in interface org.apache.spark.api.java.function.Function4
call(T) - Method in interface org.apache.spark.api.java.function.MapFunction
call(K, Iterator<V>) - Method in interface org.apache.spark.api.java.function.MapGroupsFunction
call(Iterator<T>) - Method in interface org.apache.spark.api.java.function.MapPartitionsFunction
call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
call(T, T) - Method in interface org.apache.spark.api.java.function.ReduceFunction
call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
call(T1, T2) - Method in interface org.apache.spark.api.java.function.VoidFunction2
call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
callSite() - Method in class org.apache.spark.storage.RDDInfo
callUDF(String, Column...) - Static method in class org.apache.spark.sql.functions: Call a user-defined function.
callUDF(Function0<?>, DataType) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(Function1<?, ?>, DataType, Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(Function2<?, ?, ?>, DataType, Column, Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(Function3<?, ?, ?, ?>, DataType, Column, Column, Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(Function4<?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(Function5<?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(Function6<?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(Function7<?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(Function8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(Function9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(Function10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it's redundant with udf(). This will be removed in Spark 2.0.
callUDF(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions: Call a user-defined function.
callUdf(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.5.0, since it was not coherent to have two functions callUdf and callUDF. This will be removed in Spark 2.0.
cancel() - Method in class org.apache.spark.ComplexFutureAction
cancel() - Method in interface org.apache.spark.FutureAction: Cancels the execution of this action.
cancel() - Method in class org.apache.spark.SimpleFutureAction
cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext: Cancel all jobs that have been scheduled or are running.
cancelAllJobs() - Method in class org.apache.spark.SparkContext: Cancel all jobs that have been scheduled or are running.
cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Cancel active jobs for the specified group.
cancelJobGroup(String) - Method in class org.apache.spark.SparkContext: Cancel active jobs for the specified group.
canEqual(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
canEqual(Object) - Method in class org.apache.spark.util.MutablePair
canHandle(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
canHandle(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Check if this dialect instance can handle a certain jdbc url.
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
cartesian(RDD, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
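The Cartesian product described in the two entries above pairs every element of one collection with every element of the other. A minimal sketch on ordinary in-memory lists follows; it only illustrates the semantics, it is not how Spark computes the product across partitions, and `CartesianSketch` is a hypothetical name.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: all pairs (a, b) with a drawn from
// the first list and b drawn from the second.
class CartesianSketch {
    static <A, B> List<Map.Entry<A, B>> cartesian(List<A> left, List<B> right) {
        List<Map.Entry<A, B>> pairs = new ArrayList<>();
        for (A a : left) {
            for (B b : right) {
                pairs.add(new SimpleEntry<A, B>(a, b));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, String>> pairs =
                cartesian(Arrays.asList(1, 2), Arrays.asList("a", "b", "c"));
        for (Map.Entry<Integer, String> pair : pairs) {
            System.out.println(pair.getKey() + ", " + pair.getValue());
        }
    }
}
```

The result size is the product of the two input sizes, so the operation becomes expensive quickly on large inputs.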
caseSensitive() - Method in class org.apache.spark.ml.feature.StopWordsRemover: Whether to do a case-sensitive comparison over the stop words. Default: false.
cast(DataType) - Method in class org.apache.spark.sql.Column: Casts the column to a different data type.
cast(String) - Method in class org.apache.spark.sql.Column: Casts the column to a different data type, using the canonical string representation of the type.
catalog() - Method in class org.apache.spark.sql.hive.HiveContext
catalog() - Method in class org.apache.spark.sql.SQLContext
CatalystScan - Interface in org.apache.spark.sql.sources: :: Experimental :: An interface for experimenting with a more direct connection to the query planner.
Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
CategoricalSplit - Class in org.apache.spark.ml.tree: :: DeveloperApi :: Split which tests a categorical feature.
categories() - Method in class org.apache.spark.mllib.tree.model.Split
categoryMaps() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
cbrt(Column) - Static method in class org.apache.spark.sql.functions: Computes the cube-root of the given value.
cbrt(String) - Static method in class org.apache.spark.sql.functions: Computes the cube-root of the given column.
ceil(Column) - Static method in class org.apache.spark.sql.functions: Computes the ceiling of the given value.
ceil(String) - Static method in class org.apache.spark.sql.functions: Computes the ceiling of the given column.
ceil() - Method in class org.apache.spark.sql.types.Decimal
changePrecision(int, int) - Method in class org.apache.spark.sql.types.Decimal: Update precision and scale while keeping our value the same, and return true if successful.
checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike: Mark this RDD for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.Graph: Mark this Graph for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
checkpoint() - Method in class org.apache.spark.rdd.RDD: Mark this RDD for checkpointing.
checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Sets the context to periodically checkpoint the DStream operations for master fault-tolerance.
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext: Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.
checkpointData() - Method in class org.apache.spark.rdd.RDD
checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
checkpointDir() - Method in class org.apache.spark.SparkContext
checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
checkpointFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
checkpointFile(String, ClassTag<T>) - Method in class org.apache.spark.SparkContext
checkpointInterval() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
child() - Method in class org.apache.spark.sql.sources.Not
CHILD_CONNECTION_TIMEOUT - Static variable in class org.apache.spark.launcher.SparkLauncher: Maximum time (in ms) to wait for a child process to connect back to the launcher server when using start().
CHILD_PROCESS_LOGGER_NAME - Static variable in class org.apache.spark.launcher.SparkLauncher: Logger name to use when launching a child process.
ChiSqSelector - Class in org.apache.spark.ml.feature: :: Experimental :: Chi-Squared feature selection, which selects categorical features to use for predicting a categorical label.
ChiSqSelector(String) - Constructor for class org.apache.spark.ml.feature.ChiSqSelector
ChiSqSelector() - Constructor for class org.apache.spark.ml.feature.ChiSqSelector
ChiSqSelector - Class in org.apache.spark.mllib.feature
ChiSqSelector(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
ChiSqSelectorModel - Class in org.apache.spark.ml.feature
ChiSqSelectorModel - Class in org.apache.spark.mllib.feature: Chi Squared selector model.
ChiSqSelectorModel(int[]) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel
chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics: Conduct Pearson's chi-squared goodness of fit test of the observed data against the expected distribution.
chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics: Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform distribution, with each category having an expected frequency of 1 / observed.size.
chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics: Conduct Pearson's independence test on the input contingency matrix, which cannot contain negative entries or columns or rows that sum up to 0.
chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics: Conduct Pearson's independence test for every feature against the label across the input RDD.
chiSqTest(JavaRDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics: Java-friendly version of chiSqTest()
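For the goodness-of-fit variants above, the Pearson statistic is the sum over categories of (observed - expected)^2 / expected. The sketch below computes only that statistic and the degrees of freedom in plain Java, assuming the expected counts are given directly or are uniform. It does not compute a p-value and it is not Spark's implementation; `ChiSqSketch` and its methods are hypothetical.

```java
// Hypothetical helper, not part of Spark: Pearson's chi-squared statistic for
// a goodness-of-fit test on observed category counts.
class ChiSqSketch {
    // Sum over categories of (observed - expected)^2 / expected.
    static double statistic(double[] observed, double[] expected) {
        if (observed.length != expected.length) {
            throw new IllegalArgumentException("observed and expected must have the same length");
        }
        double stat = 0.0;
        for (int i = 0; i < observed.length; i++) {
            double diff = observed[i] - expected[i];
            stat += diff * diff / expected[i];
        }
        return stat;
    }

    // Uniform case: every category is expected to hold total / n observations.
    static double statisticAgainstUniform(double[] observed) {
        double total = 0.0;
        for (double o : observed) {
            total += o;
        }
        double[] expected = new double[observed.length];
        java.util.Arrays.fill(expected, total / observed.length);
        return statistic(observed, expected);
    }

    // Degrees of freedom for a goodness-of-fit test over n categories.
    static int degreesOfFreedom(double[] observed) {
        return observed.length - 1;
    }

    public static void main(String[] args) {
        double[] observed = {10, 20, 30};
        System.out.println(statisticAgainstUniform(observed));
        System.out.println(degreesOfFreedom(observed));
    }
}
```

Turning the statistic into a p-value requires the chi-squared distribution's cumulative distribution function, which the Java standard library does not provide.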
ChiSqTestResult - Class in org.apache.spark.mllib.stat.test: Object containing the test results for the chi-squared hypothesis test.
Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification: :: DeveloperApi ::
ClassificationModel() - Constructor for class org.apache.spark.ml.classification.ClassificationModel
ClassificationModel - Interface in org.apache.spark.mllib.classification: Represents a classification model that predicts to which of a set of categories an example belongs.
Classifier<FeaturesType,E extends Classifier<FeaturesType,E,M>,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification: :: DeveloperApi ::
Classifier() - Constructor for class org.apache.spark.ml.classification.Classifier
className() - Method in class org.apache.spark.ExceptionFailure
classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
classTag() - Method in class org.apache.spark.api.java.JavaRDD
classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
clean(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLog: Clean all the records that are older than the threshold time.
CleanAccum - Class in org.apache.spark
CleanAccum(long) - Constructor for class org.apache.spark.CleanAccum
CleanBroadcast - Class in org.apache.spark
CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
CleanCheckpoint - Class in org.apache.spark
CleanCheckpoint(int) - Constructor for class org.apache.spark.CleanCheckpoint
CleanRDD - Class in org.apache.spark
CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
CleanShuffle - Class in org.apache.spark
CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
CleanupTask - Interface in org.apache.spark: Classes that represent cleaning tasks.
CleanupTaskWeakReference - Class in org.apache.spark: A WeakReference associated with a CleanupTask.
CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
clear(Param<?>) - Method in interface org.apache.spark.ml.param.Params
clear() - Method in class org.apache.spark.sql.util.ExecutionListenerManager: Removes all the registered QueryExecutionListener.
clearActive() - Static method in class org.apache.spark.sql.SQLContext: Clears the active SQLContext for current thread.
clearCache() - Method in class org.apache.spark.sql.SQLContext: Removes all cached tables from the in-memory cache.
clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext: Pass-through to SparkContext.clearCallSite.
clearCallSite() - Method in class org.apache.spark.SparkContext: Clear the thread-local property for overriding the call sites of actions and RDDs.
clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
clearDependencies() - Method in class org.apache.spark.rdd.RDD: Clears the dependencies of this RDD.
clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext: Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearFiles() - Method in class org.apache.spark.SparkContext: Clear the job's list of files added by addFile so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext: Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJars() - Method in class org.apache.spark.SparkContext: Clear the job's list of JARs added by addJar so that they do not get downloaded to any new nodes.
clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext: Clear the current thread's job group ID and its description.
clearJobGroup() - Method in class org.apache.spark.SparkContext: Clear the current thread's job group ID and its description.
clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel: Clears the threshold so that predict will output raw prediction scores.
clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel: Clears the threshold so that predict will output raw prediction scores.
clone() - Method in class org.apache.spark.SparkConf: Copy this object
clone() - Method in class org.apache.spark.sql.types.Decimal
clone() - Method in class org.apache.spark.storage.StorageLevel
clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
clone() - Method in class org.apache.spark.util.random.BernoulliSampler
clone() - Method in class org.apache.spark.util.random.PoissonSampler
clone() - Method in interface org.apache.spark.util.random.RandomSampler: return a copy of the RandomSampler object
cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler: Return a sampler that is the complement of the range specified by the current sampler.
close() - Method in class org.apache.spark.api.java.JavaSparkContext
close() - Method in class org.apache.spark.input.PortableDataStream: Closing the PortableDataStream is not needed anymore.
close() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
close() - Method in class org.apache.spark.serializer.DeserializationStream
close() - Method in class org.apache.spark.serializer.SerializationStream
close() - Method in class org.apache.spark.sql.sources.OutputWriter: Closes the OutputWriter.
close() - Method in class org.apache.spark.storage.BufferReleasingInputStream
close() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
close() - Method in class org.apache.spark.streaming.util.WriteAheadLog: Close this log and release any resources.
closeLogWriter(int) - Method in class org.apache.spark.scheduler.JobLogger: Close log file, and clean the stage relationship in stageIdToJobId
closureSerializer() - Method in class org.apache.spark.SparkEnv
cls() - Method in class org.apache.spark.util.MethodIdentifier
clsTag() - Method in interface org.apache.spark.sql.Encoder: A ClassTag that can be used to construct an Array to contain a collection of `T`.
cluster() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
clusterCenters() - Method in class org.apache.spark.ml.clustering.KMeansModel
clusterCenters() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Leaf cluster centers.
clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
cn() - Method in class org.apache.spark.mllib.feature.VocabWord
coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame that has exactly numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset that has exactly numPartitions partitions.
coalesce(Column...) - Static method in class org.apache.spark.sql.functions: Returns the first column that is not null, or null if all inputs are null.
coalesce(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Returns the first column that is not null, or null if all inputs are null.
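The SQL-style `coalesce` in the two entries above returns the first non-null input, or null when every input is null. The rule is shown here on plain Java values; this is an illustration of the semantics only, because the Spark function operates on `Column` expressions evaluated per row. `CoalesceSketch` is a hypothetical name.

```java
// Hypothetical helper, not part of Spark: first non-null argument, or null if
// all arguments are null.
class CoalesceSketch {
    @SafeVarargs
    static <T> T coalesce(T... values) {
        for (T value : values) {
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String nickname = null;
        String fullName = "Ada Lovelace";
        // Falls through the null nickname to the first available value.
        System.out.println(coalesce(nickname, fullName, "unknown"));
    }
}
```

Note that this is unrelated to the partition-reducing `coalesce(int)` methods on RDD, DataFrame, and Dataset, which share only the name.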
code() - Method in class org.apache.spark.mllib.feature.VocabWord
codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
coefficients() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
coefficients() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
coefficients() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
coefficientStandardErrors() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Standard error of estimated coefficients and intercept.
cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(GroupedDataset<K, U>, Function3<K, Iterator<V>, Iterator, TraversableOnce<R>>, Encoder<R>) - Method in class org.apache.spark.sql.GroupedDataset: Applies the given function to each cogrouped data.
cogroup(GroupedDataset<K, U>, CoGroupFunction<K, V, U, R>, Encoder<R>) - Method in class org.apache.spark.sql.GroupedDataset: Applies the given function to each cogrouped data.
cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
CoGroupedRDD<K> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that cogroups its parents.
CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner, ClassTag<K>) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
CoGroupFunction<K,V1,V2,R> - Interface in org.apache.spark.api.java.function: A function that returns zero or more output records from each grouping key and its values from 2 Datasets.
col(String) - Method in class org.apache.spark.sql.DataFrame: Selects a column based on the column name and returns it as a Column.
col(String) - Static method in class org.apache.spark.sql.functions: Returns a Column based on the given column name.
collect() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an array that contains all of the elements in this RDD.
collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
collect() - Method in class org.apache.spark.rdd.RDD: Return an array that contains all of the elements in this RDD.
collect(PartialFunction<T, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return an RDD that contains all matching values by applying f.
collect() - Method in class org.apache.spark.sql.DataFrame: Returns an array that contains all of the Rows in this DataFrame.
collect() - Method in class org.apache.spark.sql.Dataset: Returns an array that contains all the elements in this Dataset.
collect_list(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns a list of objects with duplicates.
collect_list(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns a list of objects with duplicates.
collect_set(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns a set of objects with duplicate elements eliminated.
collect_set(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns a set of objects with duplicate elements eliminated.
collectAsList() - Method in class org.apache.spark.sql.DataFrame: Returns a Java list that contains all of the Rows in this DataFrame.
collectAsList() - Method in class org.apache.spark.sql.Dataset: Returns a Java list that contains all the elements in this Dataset.
collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD: Return the key-value pairs in this RDD to the master as a Map.
collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return the key-value pairs in this RDD to the master as a Map.
collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike: The asynchronous version of collect, which returns a future for retrieving an array containing all of the elements in this RDD.
collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for retrieving all elements of this RDD.
collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps: Returns an RDD that contains for each vertex v its local edges, i.e., the edges that are incident on v, in the user-specified direction.
collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps: Collect the neighbor vertex ids for each vertex.
collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps: Collect the neighbor vertex attributes for each vertex.
collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an array that contains all of the elements in a specific partition of this RDD.
collectToPython() - Method in class org.apache.spark.sql.DataFrame
colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
colsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics: Computes column-wise summary statistics for the input RDD[Vector].
Column - Class in org.apache.spark.sql: :: Experimental :: A column that will be computed based on the data in a DataFrame.
Column(Expression) - Constructor for class org.apache.spark.sql.Column
Column(String) - Constructor for class org.apache.spark.sql.Column
column(String) - Static method in class org.apache.spark.sql.functions: Returns a Column based on the given column name.
ColumnName - Class in org.apache.spark.sql: :: Experimental :: A convenient class used for constructing schema.
ColumnName(String) - Constructor for class org.apache.spark.sql.ColumnName
ColumnPruner - Class in org.apache.spark.ml.feature: Utility transformer for removing temporary columns from a DataFrame.
ColumnPruner(Set<String>) - Constructor for class org.apache.spark.ml.feature.ColumnPruner
columns() - Method in class org.apache.spark.sql.DataFrame: Returns all column names as an array.
columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Compute all cosine similarities between columns of this matrix using the brute-force approach of computing normalized dot products.
columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Compute similarities between columns of this matrix using a sampling approach.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.api.java.JavaPairRDD: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Simplified version of combineByKey that hash-partitions the output RDD and uses map-side aggregation.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD: Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level and using map-side aggregation.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Combine elements of each key in DStream's RDDs using custom function.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Combine elements of each key in DStream's RDDs using custom functions.
combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions: :: Experimental :: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions: :: Experimental :: Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD.
combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions: :: Experimental :: Simplified version of combineByKeyWithClassTag that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
combinerClassName() - Method in class org.apache.spark.ShuffleDependency
combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.PartitionCoalescer
compare(Decimal) - Method in class org.apache.spark.sql.types.Decimal
compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
compareTo(SparkShutdownHook) - Method in class org.apache.spark.util.SparkShutdownHook
completed() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
completedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
completedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
completionTime() - Method in class org.apache.spark.scheduler.StageInfo: Time when all tasks in the stage completed or when the stage was cancelled.
completionTime() - Method in class org.apache.spark.status.api.v1.JobData
ComplexFutureAction<T> - Class in org.apache.spark: A FutureAction for actions that could trigger multiple Spark jobs.
ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
compressed() - Method in interface org.apache.spark.mllib.linalg.Vector: Returns a vector in either dense or sparse format, whichever uses less storage.
compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
CompressionCodec - Interface in org.apache.spark.io: :: DeveloperApi :: CompressionCodec allows customizing which compression implementation is used in block storage.
compute(Partition, TaskContext) - Method in class org.apache.spark.api.r.BaseRRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD: Provides the RDD[(VertexId, VD)] equivalent output.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient: Compute the gradient and loss given the features of a single data point.
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient: Compute the gradient and loss given the features of a single data point, add the gradient to a provided vector to avoid creating new objects, and return loss.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater: Compute an updated value for weights given the gradient, stepSize, iteration number and regularization parameter.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD: :: DeveloperApi :: Implemented by subclasses to compute a given partition.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Generate an RDD for the given duration.
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Method that generates an RDD for the given Duration.
compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream: Method that generates an RDD for the given time.
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes column-wise summary statistics.
computeCost(DataFrame) - Method in class org.apache.spark.ml.clustering.KMeansModel: Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
computeCost(Vector) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Computes the squared distance between the input point and the cluster center it belongs to.
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Computes the sum of squared distances between the input points and their corresponding cluster centers.
computeCost(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Java-friendly version of computeCost().
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the covariance matrix, treating each row as an observation.
computeError(org.apache.spark.mllib.tree.model.TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss: Method to calculate error of the base learner for the gradient boosting calculation.
computeError(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss: Method to calculate loss when the predictions are already known.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the Gramian matrix A^T A.
computeInitialPredictionAndError(RDD<LabeledPoint>, double, DecisionTreeModel, Loss) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel: :: DeveloperApi :: Compute the initial predictions and errors for a dataset for the first iteration of gradient boosting.
computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo: Computes the preferred locations based on the input(s) and returns a location-to-block map.
computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes the top k principal components.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Computes singular value decomposition of this matrix.
concat(Column...) - Static method in class org.apache.spark.sql.functions: Concatenates multiple input string columns together into a single string column.
concat(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Concatenates multiple input string columns together into a single string column.
concat_ws(String, Column...) - Static method in class org.apache.spark.sql.functions: Concatenates multiple input string columns together into a single string column, using the given separator.
concat_ws(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions: Concatenates multiple input string columns together into a single string column, using the given separator.
conf() - Method in class org.apache.spark.SparkEnv
conf() - Method in class org.apache.spark.sql.hive.HiveContext
conf() - Method in class org.apache.spark.sql.SQLContext
conf() - Method in class org.apache.spark.streaming.StreamingContext
confidence() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule: Returns the confidence of the rule.
confidence() - Method in class org.apache.spark.partial.BoundedDouble
configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD: Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.NewHadoopRDD: Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
configure() - Method in class org.apache.spark.sql.hive.HiveContext: Overridden by child classes that need to set configuration before the client init.
confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns the confusion matrix: predicted classes are in columns, ordered by class label ascending, as in "labels".
connectedComponents() - Method in class org.apache.spark.graphx.GraphOps: Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
ConnectedComponents - Class in org.apache.spark.graphx.lib: Connected components algorithm.
ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
consequent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream: An input stream that always returns the same RDD on each timestep.
ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap: Checks whether a parameter is explicitly specified.
contains(String) - Method in class org.apache.spark.SparkConf: Does the configuration contain a given parameter?
contains(Object) - Method in class org.apache.spark.sql.Column: Returns a boolean Column indicating whether this column contains the other element.
contains(String) - Method in class org.apache.spark.sql.types.Metadata: Tests whether this Metadata contains a binding for a key.
containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus: Return whether the given block is stored in this block manager in O(1) time.
containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
containsNull() - Method in class org.apache.spark.sql.types.ArrayType
context() - Method in interface org.apache.spark.api.java.JavaRDDLike: The SparkContext that this RDD was created on.
context() - Method in class org.apache.spark.InterruptibleIterator
context(SQLContext) - Method in class org.apache.spark.ml.util.MLReader
context(SQLContext) - Method in class org.apache.spark.ml.util.MLWriter
context() - Method in class org.apache.spark.rdd.RDD: The SparkContext that this RDD was created on.
context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return the StreamingContext associated with this DStream.
context() - Method in class org.apache.spark.streaming.dstream.DStream: Return the StreamingContext associated with this DStream.
Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
ContinuousSplit - Class in org.apache.spark.ml.tree: :: DeveloperApi :: Split which tests a continuous feature.
conv(Column, int, int) - Static method in class org.apache.spark.sql.functions: Convert a number in a string column from one base to another.
CONVERT_CTAS() - Static method in class org.apache.spark.sql.hive.HiveContext
CONVERT_METASTORE_PARQUET() - Static method in class org.apache.spark.sql.hive.HiveContext
CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING() - Static method in class org.apache.spark.sql.hive.HiveContext
convertCTAS() - Method in class org.apache.spark.sql.hive.HiveContext: When true, a table created by a Hive CTAS statement (no USING clause) will be converted to a data source table, using the data source set by spark.sql.sources.default.
convertMetastoreParquet() - Method in class org.apache.spark.sql.hive.HiveContext: When true, enables an experimental feature where metastore tables that use the parquet SerDe are automatically converted to use the Spark SQL parquet table scan, instead of the Hive SerDe.
convertMetastoreParquetWithSchemaMerging() - Method in class org.apache.spark.sql.hive.HiveContext: When true, also tries to merge possibly different but compatible Parquet schemas in different Parquet data files.
convertToCanonicalEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.GraphOps: Convert bi-directional edges into uni-directional ones.
CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassifier
copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegression
copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
copy(ParamMap) - Method in class org.apache.spark.ml.classification.NaiveBayes
copy(ParamMap) - Method in class org.apache.spark.ml.classification.NaiveBayesModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.OneVsRest
copy(ParamMap) - Method in class org.apache.spark.ml.classification.OneVsRestModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.KMeans
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.KMeansModel
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.LDA
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.LocalLDAModel
copy(ParamMap) - Method in class org.apache.spark.ml.Estimator
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.Evaluator
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Binarizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Bucketizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ChiSqSelector
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ColumnPruner
copy(ParamMap) - Method in class org.apache.spark.ml.feature.CountVectorizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.HashingTF
copy(ParamMap) - Method in class org.apache.spark.ml.feature.IDF
copy(ParamMap) - Method in class org.apache.spark.ml.feature.IDFModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.IndexToString
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Interaction
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinMaxScaler
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.OneHotEncoder
copy(ParamMap) - Method in class org.apache.spark.ml.feature.PCA
copy(ParamMap) - Method in class org.apache.spark.ml.feature.PCAModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.PolynomialExpansion
copy(ParamMap) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.RegexTokenizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.RFormula
copy(ParamMap) - Method in class org.apache.spark.ml.feature.RFormulaModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.SQLTransformer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StopWordsRemover
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StringIndexer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StringIndexerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Tokenizer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorAssembler
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorIndexer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorSlicer
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Word2Vec
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Word2VecModel
copy(ParamMap) - Method in class org.apache.spark.ml.Model
copy() - Method in class org.apache.spark.ml.param.ParamMap: Creates a copy of this param map.
copy(ParamMap) - Method in interface org.apache.spark.ml.param.Params: Creates a copy of this instance with the same UID and some extra params.
copy(ParamMap) - Method in class org.apache.spark.ml.Pipeline
copy(ParamMap) - Method in class org.apache.spark.ml.PipelineModel
copy(ParamMap) - Method in class org.apache.spark.ml.PipelineStage
copy(ParamMap) - Method in class org.apache.spark.ml.Predictor
copy(ParamMap) - Method in class org.apache.spark.ml.recommendation.ALS
copy(ParamMap) - Method in class org.apache.spark.ml.recommendation.ALSModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
copy(ParamMap) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressor
copy(ParamMap) - Method in class org.apache.spark.ml.regression.IsotonicRegression
copy(ParamMap) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegression
copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
copy(ParamMap) - Method in class org.apache.spark.ml.Transformer
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
copy(ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
copy() - Method in interface org.apache.spark.mllib.linalg.Matrix: Get a deep copy of the matrix.
copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
copy() - Method in interface org.apache.spark.mllib.linalg.Vector: Makes a deep copy of this vector.
copy() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
copy() - Method in class org.apache.spark.mllib.random.GammaGenerator
copy() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator: Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the class when applicable for non-locking concurrent usage.
copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
copy() - Method in class org.apache.spark.mllib.random.WeibullGenerator
copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy: Returns a shallow copy of this instance.
copy() - Method in interface org.apache.spark.sql.Row: Make a copy of the current Row object.
copy() - Method in class org.apache.spark.util.StatCounter: Clone this StatCounter.
copyValues(T, ParamMap) - Method in interface org.apache.spark.ml.param.Params: Copies param values from this instance to another instance for params shared by them.
coresGranted() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
coresPerExecutor() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics: Compute the Pearson correlation matrix for the input RDD of Vectors.
corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics: Compute the correlation matrix for the input RDD of Vectors using the specified method.
corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics: Compute the Pearson correlation for the input RDDs.
corr(JavaRDD<Double>, JavaRDD<Double>) - Static method in class org.apache.spark.mllib.stat.Statistics: Java-friendly version of corr().
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics: Compute the correlation for the input RDDs using the specified method.
corr(JavaRDD<Double>, JavaRDD<Double>, String) - Static method in class org.apache.spark.mllib.stat.Statistics: Java-friendly version of corr().
corr(String, String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Calculates the correlation of two columns of a DataFrame.
corr(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Calculates the Pearson Correlation Coefficient of two columns of a DataFrame.
corr(Column, Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the Pearson Correlation Coefficient for two columns.
corr(String, String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the Pearson Correlation Coefficient for two columns.
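
The corr entries above all refer to the Pearson correlation coefficient. As a rough illustration of what that quantity is, the following plain-Java sketch computes it for two small in-memory samples. It is not Spark's implementation and does not use any Spark API; the class name PearsonSketch and the method pearson are hypothetical names chosen for this example.

```java
/** Plain-Java sketch of the Pearson correlation coefficient; not Spark's implementation. */
final class PearsonSketch {

    /** Returns the Pearson correlation of two equal-length samples. */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }

        // First pass: the mean of each sample.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        // Second pass: sums of products of deviations from the means.
        double sxy = 0.0;
        double sxx = 0.0;
        double syy = 0.0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }

        // The result is NaN when either sample is constant (zero variance).
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0};
        double[] y = {2.0, 4.0, 6.0, 8.0};
        System.out.println(pearson(x, y));
    }
}
```

A perfectly increasing linear relationship gives a value of 1, a perfectly decreasing one gives -1, and unrelated samples give a value near 0.
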
cos(Column) - Static method in class org.apache.spark.sql.functions: Computes the cosine of the given value.
cos(String) - Static method in class org.apache.spark.sql.functions: Computes the cosine of the given column.
cosh(Column) - Static method in class org.apache.spark.sql.functions: Computes the hyperbolic cosine of the given value.
cosh(String) - Static method in class org.apache.spark.sql.functions: Computes the hyperbolic cosine of the given column.
count() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the number of elements in the RDD.
count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl: The number of edges in the RDD.
count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl: The number of vertices in the RDD.
count() - Method in class org.apache.spark.ml.regression.AFTAggregator
count() - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Sample size.
count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample size.
count() - Method in class org.apache.spark.rdd.RDD: Return the number of elements in the RDD.
count() - Method in class org.apache.spark.sql.DataFrame: Returns the number of rows in the DataFrame.
count() - Method in class org.apache.spark.sql.Dataset: Returns the number of elements in the Dataset.
count(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of items in a group.
count(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of items in a group.
count() - Method in class org.apache.spark.sql.GroupedData: Count the number of rows for each group.
count() - Method in class org.apache.spark.sql.GroupedDataset: Returns a Dataset that contains a tuple with each key and the number of items present for that key.
count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.streaming.kafka.OffsetRange: Number of messages this OffsetRange refers to.
count() - Method in class org.apache.spark.util.StatCounter
countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long, double) - Method in class org.apache.spark.rdd.RDD: Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return approximate number of distinct elements in the RDD.
countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD: Return approximate number of distinct elements in the RDD.
countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD: Return approximate number of distinct elements in the RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return approximate number of distinct values for each key in this RDD.
countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike: The asynchronous version of count, which returns a future for counting the number of elements in this RDD.
countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for counting the number of elements in the RDD.
countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Count the number of elements for each key, and return the result to the master as a Map.
countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions: Count the number of elements for each key, collecting the results to a local Map.
countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions: Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return the count of each unique value in this RDD as a local map of (value, count) pairs.
countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
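
The countByValue entries above describe a result of (value, count) pairs. A minimal plain-Java sketch of that shape, working on an in-memory list rather than an RDD or DStream, might look like the following. The class CountByValueSketch and its method are hypothetical names for this illustration and do not appear in Spark.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of a (value, count) map; not Spark's implementation. */
final class CountByValueSketch {

    /** Counts how many times each distinct value occurs in the list. */
    static <T> Map<T, Long> countByValue(List<T> values) {
        Map<T, Long> counts = new HashMap<>();
        for (T value : values) {
            // Start at 1 for a new value, otherwise add 1 to the existing count.
            counts.merge(value, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "a");
        // Iteration order of a HashMap is unspecified, so the printed order may vary.
        System.out.println(countByValue(words));
    }
}
```
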
countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike: (Experimental) Approximate version of countByValue().
countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike: (Experimental) Approximate version of countByValue().
countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Approximate version of countByValue().
countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by counting the number of elements in a window over this DStream.
countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by counting the number of elements in a sliding window over this DStream.
countDistinct(Column, Column...) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of distinct items in a group.
countDistinct(String, String...) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of distinct items in a group.
countDistinct(Column, Seq<Column>) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of distinct items in a group.
countDistinct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the number of distinct items in a group.
countTowardsTaskFailures() - Method in class org.apache.spark.ExecutorLostFailure
countTowardsTaskFailures() - Method in class org.apache.spark.TaskCommitDenied: If a task failed because its attempt to commit was denied, do not count this failure towards failing the stage.
countTowardsTaskFailures() - Method in interface org.apache.spark.TaskFailedReason: Whether this task failure should be counted towards the maximum number of times the task is allowed to fail before the stage is aborted.
CountVectorizer - Class in org.apache.spark.ml.feature: :: Experimental :: Extracts a vocabulary from document collections and generates a CountVectorizerModel.
CountVectorizer(String) - Constructor for class org.apache.spark.ml.feature.CountVectorizer
CountVectorizer() - Constructor for class org.apache.spark.ml.feature.CountVectorizer
CountVectorizerModel - Class in org.apache.spark.ml.feature: :: Experimental :: Converts a text document to a sparse vector of token counts.
CountVectorizerModel(String, String[]) - Constructor for class org.apache.spark.ml.feature.CountVectorizerModel
CountVectorizerModel(String[]) - Constructor for class org.apache.spark.ml.feature.CountVectorizerModel
cov(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Calculate the sample covariance of two numerical columns of a DataFrame.
crc32(Column) - Static method in class org.apache.spark.sql.functions: Calculates the cyclic redundancy check value (CRC32) of a binary column and returns the value as a bigint.
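
To illustrate why the crc32 entry above reports its result as a bigint, here is a small plain-Java sketch that computes a CRC32 checksum with the JDK's java.util.zip.CRC32. It assumes the standard CRC-32 algorithm and makes no claim about how the Spark function is implemented internally; the class name Crc32Sketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Plain-Java sketch of a CRC32 checksum over a byte array; not Spark code. */
final class Crc32Sketch {

    /** Returns the CRC32 of the bytes as a non-negative long. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        // getValue() returns the unsigned 32-bit checksum widened to a long,
        // so values above Integer.MAX_VALUE are preserved without sign issues.
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] check = "123456789".getBytes(StandardCharsets.US_ASCII);
        // "123456789" is the conventional CRC-32 check input; prints cbf43926.
        System.out.println(Long.toHexString(crc32(check)));
    }
}
```
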
CreatableRelationProvider - Interface in org.apache.spark.sql.sources
create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels: Deprecated.
create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels: Create a new StorageLevel object.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD: Create an RDD that executes an SQL query on a JDBC connection and reads results.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD: Create an RDD that executes an SQL query on a JDBC connection and reads results.
create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD: Create a PartitionPruningRDD.
create(Object...) - Static method in class org.apache.spark.sql.RowFactory: Create a Row from the given arguments.
create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
create(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
create(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
create(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
createArrayType(DataType) - Static method in class org.apache.spark.sql.types.DataTypes: Creates an ArrayType by specifying the data type of elements (elementType).
createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes: Creates an ArrayType by specifying the data type of elements (elementType) and whether the array contains null values (containsNull).
createCombiner() - Method in class org.apache.spark.Aggregator
createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(List<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
createDataFrame(List<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
createDataset(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
createDataset(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
createDataset(List<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
createDecimalType(int, int) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a DecimalType by specifying the precision and scale.
createDecimalType() - Static method in class org.apache.spark.sql.types.DataTypes: Creates a DecimalType with default precision and scale, which are 10 and 0.
createDirectStream(StreamingContext, Map<String, String>, Map<TopicAndPartition, Object>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(StreamingContext, Map<String, String>, Set<String>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, Map<TopicAndPartition, Long>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, Set<String>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createExternalTable(String, String) - Method in class org.apache.spark.sql.SQLContext
createExternalTable(String, String, String) - Method in class org.apache.spark.sql.SQLContext
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
createJDBCTable(String, String, boolean) - Method in class org.apache.spark.sql.DataFrame: Deprecated. As of 1.4.0, replaced by write().jdbc(). This will be removed in Spark 2.0.
createLogDir() - Method in class org.apache.spark.scheduler.JobLogger: Create a folder for log files; the folder's name is the creation time of jobLogger.
createLogWriter(int) - Method in class org.apache.spark.scheduler.JobLogger: Create a log file for one job.
createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a MapType by specifying the data type of keys (keyType) and values (valueType).
createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a MapType by specifying the data type of keys (keyType), the data type of values (valueType), and whether values contain any null value (valueContainsNull).
createModel(Vector, double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
createModel(Vector, double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
createModel(Vector, double) - Method in class org.apache.spark.mllib.classification.SVMWithSGD
createModel(Vector, double) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Create a model given the weights and intercept.
createModel(Vector, double) - Method in class org.apache.spark.mllib.regression.LassoWithSGD
createModel(Vector, double) - Method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
createModel(Vector, double) - Method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createRDD(SparkContext, Map<String, String>, OffsetRange[], ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(SparkContext, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, OffsetRange[]) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an RDD from Kafka using offset ranges for each topic and partition.
createRDDFromArray(JavaSparkContext, byte[][]) - Static method in class org.apache.spark.api.r.RRDD: Create an RRDD given a sequence of byte arrays.
createRDDWithLocalProperties(Time, boolean, Function0<U>) - Method in class org.apache.spark.streaming.dstream.DStream: Wrap a body of code such that the call site and operation scope information are passed to the RDDs created in this body properly.
createRelation(SQLContext, Map<String, String>) - Method in class org.apache.spark.ml.source.libsvm.DefaultSource
createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in interface org.apache.spark.sql.sources.CreatableRelationProvider: Creates a relation with the given parameters based on the contents of the given DataFrame.
createRelation(SQLContext, String[], Option<StructType>, Option<StructType>, Map<String, String>) - Method in interface org.apache.spark.sql.sources.HadoopFsRelationProvider: Returns a new base relation with the given parameters, a user defined schema, and a list of partition columns.
createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider: Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>, StructType) - Method in interface org.apache.spark.sql.sources.SchemaRelationProvider: Returns a new base relation with the given parameters and user defined schema.
createRWorker(int) - Static method in class org.apache.spark.api.r.RRDD: ProcessBuilder used to launch worker R processes.
createSparkContext(String, String, String, String[], Map<Object, Object>, Map<Object, Object>) - Static method in class org.apache.spark.api.r.RRDD
createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Create an input stream from a Flume source.
createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Create an input stream from a Flume source.
createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils: Creates an input stream from a Flume source.
createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from Kafka Brokers.
createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils: Create an input stream that pulls messages from Kafka Brokers.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an input stream that pulls messages from a Kinesis stream.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, String, String, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an input stream that pulls messages from a Kinesis stream.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an input stream that pulls messages from a Kinesis stream.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an input stream that pulls messages from a Kinesis stream.
createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an input stream that pulls messages from a Kinesis stream.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an input stream that pulls messages from a Kinesis stream.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an input stream that pulls messages from a Kinesis stream.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an input stream that pulls messages from a Kinesis stream.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an input stream that pulls messages from a Kinesis stream.
createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils: Create an input stream that pulls messages from a Kinesis stream.
createStream(JavaStreamingContext, String, String, String, String, int, Duration, StorageLevel, String, String) - Method in class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils: Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils: Create an input stream that receives messages pushed by an MQTT publisher.
createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils: Create an input stream that receives messages pushed by an MQTT publisher.
createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter using Twitter4J's default OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey, twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and twitter4j.oauth.accessTokenSecret.
createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils: Create an input stream that returns tweets received from Twitter.
createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils: Create an input stream that receives messages pushed by a zeromq publisher.
createStructField(String, DataType, boolean, Metadata) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a StructField by specifying the name (name), data type (dataType) and whether values of this field can be null values (nullable).
createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a StructField with empty metadata.
createStructType(List<StructField>) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a StructType with the given list of StructFields (fields).
createStructType(StructField[]) - Static method in class org.apache.spark.sql.types.DataTypes: Creates a StructType with the given StructField array (fields).
createTransformFunc() - Method in class org.apache.spark.ml.feature.DCT
createTransformFunc() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
createTransformFunc() - Method in class org.apache.spark.ml.feature.NGram
createTransformFunc() - Method in class org.apache.spark.ml.feature.Normalizer
createTransformFunc() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
createTransformFunc() - Method in class org.apache.spark.ml.feature.RegexTokenizer
createTransformFunc() - Method in class org.apache.spark.ml.feature.Tokenizer
createTransformFunc() - Method in class org.apache.spark.ml.UnaryTransformer: Creates the transform function using the given param map.
creationSite() - Method in class org.apache.spark.rdd.RDD: User code that created this RDD (e.g.
creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
crosstab(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Computes a pair-wise frequency table of the given columns.
CrossValidator - Class in org.apache.spark.ml.tuning: :: Experimental :: K-fold cross validation.
CrossValidator(String) - Constructor for class org.apache.spark.ml.tuning.CrossValidator
CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
CrossValidatorModel - Class in org.apache.spark.ml.tuning: :: Experimental :: Model from k-fold cross validation.
cube(Column...) - Method in class org.apache.spark.sql.DataFrame: Create a multi-dimensional cube for the current DataFrame using the specified columns, so we can run aggregation on them.
cube(String, String...) - Method in class org.apache.spark.sql.DataFrame: Create a multi-dimensional cube for the current DataFrame using the specified columns, so we can run aggregation on them.
cube(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame: Create a multi-dimensional cube for the current DataFrame using the specified columns, so we can run aggregation on them.
cube(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Create a multi-dimensional cube for the current DataFrame using the specified columns, so we can run aggregation on them.
cume_dist() - Static method in class org.apache.spark.sql.functions: Window function: returns the cumulative distribution of values within a window partition, i.e.
cumeDist() - Static method in class org.apache.spark.sql.functions: Deprecated. As of 1.6.0, replaced by cume_dist. This will be removed in Spark 2.0.
current_date() - Static method in class org.apache.spark.sql.functions: Returns the current date as a date column.
current_timestamp() - Static method in class org.apache.spark.sql.functions: Returns the current timestamp as a timestamp column.
currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
currPrefLocs(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer

D

databaseTypeDefinition() - Method in class org.apache.spark.sql.jdbc.JdbcType
dataDistribution() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
DataFrame - Class in org.apache.spark.sql: :: Experimental :: A distributed collection of data organized into named columns.
DataFrame(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.DataFrame: A constructor that automatically analyzes the logical plan.
DataFrameHolder - Class in org.apache.spark.sql: A container for a DataFrame, used for implicit conversions.
DataFrameNaFunctions - Class in org.apache.spark.sql: :: Experimental :: Functionality for working with missing data in DataFrames.
DataFrameReader - Class in org.apache.spark.sql: :: Experimental :: Interface used to load a DataFrame from external storage systems (e.g.
DataFrameStatFunctions - Class in org.apache.spark.sql: :: Experimental :: Statistic functions for DataFrames.
DataFrameWriter - Class in org.apache.spark.sql: :: Experimental :: Interface used to write a DataFrame to external storage systems (e.g.
dataSchema() - Method in class org.apache.spark.sql.sources.HadoopFsRelation: Specifies schema of actual data files.
Dataset<T> - Class in org.apache.spark.sql: :: Experimental :: A Dataset is a strongly typed collection of objects that can be transformed in parallel using functional or relational operations.
DatasetHolder<T> - Class in org.apache.spark.sql: A container for a Dataset, used for implicit conversions.
DataSourceRegister - Interface in org.apache.spark.sql.sources: :: DeveloperApi :: Data sources should implement this trait so that they can register an alias to their data source.
dataStream() - Method in class org.apache.spark.api.r.BaseRRDD
dataType() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: The DataType of the returned value of this UserDefinedAggregateFunction.
DataType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The base type of all Spark SQL data types.
DataType() - Constructor for class org.apache.spark.sql.types.DataType
dataType() - Method in class org.apache.spark.sql.types.StructField
dataType() - Method in class org.apache.spark.sql.UserDefinedFunction
DataTypes - Class in org.apache.spark.sql.types: To get or create a specific data type, users should use the singleton objects and factory methods provided by this class.
DataTypes() - Constructor for class org.apache.spark.sql.types.DataTypes
DataValidators - Class in org.apache.spark.mllib.util: :: DeveloperApi :: A collection of methods used to validate data before applying ML algorithms.
DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
date() - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type date.
DATE() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable date type.
date_add(Column, int) - Static method in class org.apache.spark.sql.functions: Returns the date that is the given number of days (days) after the start date (start).
date_format(Column, String) - Static method in class org.apache.spark.sql.functions: Converts a date/timestamp/string to a value of string in the format specified by the date format given by the second argument.
date_sub(Column, int) - Static method in class org.apache.spark.sql.functions: Returns the date that is the given number of days (days) before the start date (start).
datediff(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns the number of days from start to end.
DateType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the DateType object.
DateType - Class in org.apache.spark.sql.types: :: DeveloperApi :: A date type, supporting "0001-01-01" through "9999-12-31".
dayofmonth(Column) - Static method in class org.apache.spark.sql.functions: Extracts the day of the month as an integer from a given date/timestamp/string.
dayofyear(Column) - Static method in class org.apache.spark.sql.functions: Extracts the day of the year as an integer from a given date/timestamp/string.
DB2Dialect - Class in org.apache.spark.sql.jdbc
DB2Dialect() - Constructor for class org.apache.spark.sql.jdbc.DB2Dialect
DCT - Class in org.apache.spark.ml.feature: :: Experimental :: A feature transformer that takes the 1D discrete cosine transform of a real vector.
DCT(String) - Constructor for class org.apache.spark.ml.feature.DCT
DCT() - Constructor for class org.apache.spark.ml.feature.DCT
ddlParser() - Method in class org.apache.spark.sql.SQLContext
decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
decimal() - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type decimal.
decimal(int, int) - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type decimal.
DECIMAL() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable decimal type.
Decimal - Class in org.apache.spark.sql.types: A mutable implementation of BigDecimal that can hold a Long if values are small enough.
Decimal() - Constructor for class org.apache.spark.sql.types.Decimal
DecimalType - Class in org.apache.spark.sql.types
DecimalType(int, int) - Constructor for class org.apache.spark.sql.types.DecimalType
DecimalType(int) - Constructor for class org.apache.spark.sql.types.DecimalType
DecimalType() - Constructor for class org.apache.spark.sql.types.DecimalType
DecimalType(Option<PrecisionInfo>) - Constructor for class org.apache.spark.sql.types.DecimalType
DecisionTree - Class in org.apache.spark.mllib.tree: A class which implements a decision tree learning algorithm for classification and regression.
DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
DecisionTreeClassificationModel - Class in org.apache.spark.ml.classification: :: Experimental :: Decision tree model for classification.
DecisionTreeClassifier - Class in org.apache.spark.ml.classification: :: Experimental :: Decision tree learning algorithm for classification.
DecisionTreeClassifier(String) - Constructor for class org.apache.spark.ml.classification.DecisionTreeClassifier
DecisionTreeClassifier() - Constructor for class org.apache.spark.ml.classification.DecisionTreeClassifier
DecisionTreeModel - Class in org.apache.spark.mllib.tree.model: Decision tree model for classification or regression.
DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
DecisionTreeRegressionModel - Class in org.apache.spark.ml.regression: :: Experimental :: Decision tree model for regression.
DecisionTreeRegressor - Class in org.apache.spark.ml.regression: :: Experimental :: Decision tree learning algorithm for regression.
DecisionTreeRegressor(String) - Constructor for class org.apache.spark.ml.regression.DecisionTreeRegressor
DecisionTreeRegressor() - Constructor for class org.apache.spark.ml.regression.DecisionTreeRegressor
decode(Column, String) - Static method in class org.apache.spark.sql.functions: Decodes the first argument from binary into a string using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').
decodeLabel(Vector) - Static method in class org.apache.spark.ml.classification.LabelConverter: Converts a vector to a label.
defaultAttr() - Static method in class org.apache.spark.ml.attribute.BinaryAttribute: The default binary attribute.
defaultAttr() - Static method in class org.apache.spark.ml.attribute.NominalAttribute: The default nominal attribute.
defaultAttr() - Static method in class org.apache.spark.ml.attribute.NumericAttribute: The default numeric attribute.
defaultClassLoader() - Method in class org.apache.spark.serializer.Serializer: Default ClassLoader to use in deserialization.
defaultCopy(ParamMap) - Method in interface org.apache.spark.ml.param.Params: Default implementation of copy with extra params.
defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext: Default min number of partitions for Hadoop RDDs when not given by user.
defaultMinPartitions() - Method in class org.apache.spark.SparkContext: Default min number of partitions for Hadoop RDDs when not given by user. Notice that we use math.min so the "defaultMinPartitions" cannot be higher than 2.
defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext: Deprecated. As of Spark 1.0.0, defaultMinSplits is deprecated; use JavaSparkContext.defaultMinPartitions() instead.
defaultMinSplits() - Method in class org.apache.spark.SparkContext: Default min number of partitions for Hadoop RDDs when not given by user
defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext: Default level of parallelism to use when not given by user (e.g.
defaultParallelism() - Method in class org.apache.spark.SparkContext: Default level of parallelism to use when not given by user (e.g.
defaultParamMap() - Method in interface org.apache.spark.ml.param.Params: Internal param map for default values.
defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
defaultParams(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner: Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
defaultSize() - Method in class org.apache.spark.sql.types.ArrayType: The default size of a value of the ArrayType is 100 * the default size of the element type.
defaultSize() - Method in class org.apache.spark.sql.types.BinaryType: The default size of a value of the BinaryType is 4096 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.BooleanType: The default size of a value of the BooleanType is 1 byte.
defaultSize() - Method in class org.apache.spark.sql.types.ByteType: The default size of a value of the ByteType is 1 byte.
defaultSize() - Method in class org.apache.spark.sql.types.CalendarIntervalType
defaultSize() - Method in class org.apache.spark.sql.types.DataType: The default size of a value of this data type, used internally for size estimation.
defaultSize() - Method in class org.apache.spark.sql.types.DateType: The default size of a value of the DateType is 4 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.DecimalType: The default size of a value of the DecimalType is 4096 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.DoubleType: The default size of a value of the DoubleType is 8 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.FloatType: The default size of a value of the FloatType is 4 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.IntegerType: The default size of a value of the IntegerType is 4 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.LongType: The default size of a value of the LongType is 8 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.MapType: The default size of a value of the MapType is 100 * (the default size of the key type + the default size of the value type).
defaultSize() - Method in class org.apache.spark.sql.types.NullType
defaultSize() - Method in class org.apache.spark.sql.types.ShortType: The default size of a value of the ShortType is 2 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.StringType: The default size of a value of the StringType is 4096 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.StructType: The default size of a value of the StructType is the total default sizes of all field types.
defaultSize() - Method in class org.apache.spark.sql.types.TimestampType: The default size of a value of the TimestampType is 8 bytes.
defaultSize() - Method in class org.apache.spark.sql.types.UserDefinedType: The default size of a value of the UserDefinedType is 4096 bytes.
DefaultSource - Class in org.apache.spark.ml.source.libsvm: libsvm package implements Spark SQL data source API for loading LIBSVM data as DataFrame.
DefaultSource() - Constructor for class org.apache.spark.ml.source.libsvm.DefaultSource
defaultStategy(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy: Construct a default set of parameters for DecisionTree
defaultStrategy(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy: Construct a default set of parameters for DecisionTree
defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
degree() - Method in class org.apache.spark.ml.feature.PolynomialExpansion: The polynomial degree to expand, which should be >= 1.
degrees() - Method in class org.apache.spark.graphx.GraphOps: The degree of each vertex in the graph.
degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult: Returns the degree(s) of freedom of the hypothesis test.
delegate() - Method in class org.apache.spark.InterruptibleIterator
dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices: Creates a column-major dense matrix.
dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from its values.
dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from its values.
dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a dense vector from a double array.
dense_rank() - Static method in class org.apache.spark.sql.functions: Window function: returns the rank of rows within a window partition, without any gaps.
DenseMatrix - Class in org.apache.spark.mllib.linalg: Column-major dense matrix.
DenseMatrix(int, int, double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix: Column-major dense matrix.
denseRank() - Static method in class org.apache.spark.sql.functions: Deprecated. As of 1.6.0, replaced by dense_rank. This will be removed in Spark 2.0.
DenseVector - Class in org.apache.spark.mllib.linalg: A dense vector represented by a value array.
DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
dependencies() - Method in class org.apache.spark.rdd.RDD: Get the list of dependencies of this RDD, taking into account whether the RDD is checkpointed or not.
dependencies() - Method in class org.apache.spark.streaming.dstream.DStream: List of parent DStreams on which this DStream depends.
dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
Dependency<T> - Class in org.apache.spark: :: DeveloperApi :: Base class for dependencies.
Dependency() - Constructor for class org.apache.spark.Dependency
depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Get depth of tree.
DerbyDialect - Class in org.apache.spark.sql.jdbc
DerbyDialect() - Constructor for class org.apache.spark.sql.jdbc.DerbyDialect
desc() - Method in class org.apache.spark.sql.Column: Returns an ordering used in sorting.
desc(String) - Static method in class org.apache.spark.sql.functions: Returns a sort expression based on the descending order of the column.
desc() - Method in class org.apache.spark.util.MethodIdentifier
describe(String...) - Method in class org.apache.spark.sql.DataFrame: Computes statistics for numeric columns, including count, mean, stddev, min, and max.
describe(Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Computes statistics for numeric columns, including count, mean, stddev, min, and max.
describeTopics(int) - Method in class org.apache.spark.ml.clustering.LDAModel: Return the topics described by their top-weighted terms.
describeTopics() - Method in class org.apache.spark.ml.clustering.LDAModel
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LDAModel: Return the topics described by weighted terms.
describeTopics() - Method in class org.apache.spark.mllib.clustering.LDAModel: Return the topics described by weighted terms.
describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
description() - Method in class org.apache.spark.ExceptionFailure
description() - Method in class org.apache.spark.status.api.v1.JobData
description() - Method in class org.apache.spark.storage.StorageLevel
description() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
DeserializationStream - Class in org.apache.spark.serializer: :: DeveloperApi :: A stream for reading serialized objects.
DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
deserialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
deserialize(Object) - Method in class org.apache.spark.sql.types.UserDefinedType: Convert a SQL datum to the user type.
deserialized() - Method in class org.apache.spark.storage.MemoryEntry
deserialized() - Method in class org.apache.spark.storage.StorageLevel
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.DummySerializerInstance
deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
destroy() - Method in class org.apache.spark.broadcast.Broadcast: Destroy all data and metadata related to this broadcast variable.
details() - Method in class org.apache.spark.scheduler.StageInfo
details() - Method in class org.apache.spark.status.api.v1.StageData
determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner: Determines the bounds for range partitioning from candidates with weights indicating how many items each represents.
deterministic() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Returns true iff this function is deterministic, i.e.
DeveloperApi - Annotation Type in org.apache.spark.annotation: A lower-level, unstable API intended for developers.
devianceResiduals() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: The weighted residuals, the usual residuals rescaled by the square root of the instance weights.
diag(Vector) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a diagonal matrix in DenseMatrix format from the supplied values.
diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a diagonal matrix in Matrix format from the supplied values.
dialectClassName() - Method in class org.apache.spark.sql.SQLContext
diff(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
diff(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.VertexRDD: For each vertex present in both this and other, diff returns only those vertices with differing values; for values that are different, keeps the values from other.
diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD: For each vertex present in both this and other, diff returns only those vertices with differing values; for values that are different, keeps the values from other.
disableOutputSpecValidation() - Static method in class org.apache.spark.rdd.PairRDDFunctions
disconnect() - Method in interface org.apache.spark.launcher.SparkAppHandle: Disconnects the handle from the application, without stopping it.
DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.StageData
diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetrics
diskSize() - Method in class org.apache.spark.storage.BlockStatus
diskSize() - Method in class org.apache.spark.storage.BlockUpdatedInfo
diskSize() - Method in class org.apache.spark.storage.RDDInfo
diskUsed() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
diskUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
diskUsed() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
diskUsed() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
diskUsed() - Method in class org.apache.spark.storage.StorageStatus: Return the disk space used by this block manager.
diskUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus: Return the disk space used by the given RDD in this block manager in O(1) time.
dist(Vector) - Method in class org.apache.spark.util.Vector
distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing the distinct elements in this RDD.
distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing the distinct elements in this RDD.
distinct() - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame that contains only the unique rows from this DataFrame.
distinct() - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset that contains only the unique elements of this Dataset.
distinct(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Creates a Column for this UDAF using the distinct values of the given Columns as input arguments.
distinct(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Creates a Column for this UDAF using the distinct values of the given Columns as input arguments.
DistributedLDAModel - Class in org.apache.spark.ml.clustering: :: Experimental ::
DistributedLDAModel - Class in org.apache.spark.mllib.clustering
DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed: Represents a distributively stored matrix backed by one or more RDDs.
div(Duration) - Method in class org.apache.spark.streaming.Duration
divide(Object) - Method in class org.apache.spark.sql.Column: Divides this expression by another expression.
divide(double) - Method in class org.apache.spark.util.Vector
doc() - Method in class org.apache.spark.ml.param.Param
docConcentration() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
docConcentration() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
docConcentration() - Method in class org.apache.spark.mllib.clustering.LDAModel: Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
docConcentration() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
doDestroy(boolean) - Method in class org.apache.spark.broadcast.Broadcast: Actually destroy all data and metadata related to this broadcast variable.
dot(Vector) - Method in class org.apache.spark.util.Vector
DOUBLE() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable double type.
doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator double variable, which tasks can "add" values to using the add method.
doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator double variable, which tasks can "add" values to using the add method.
DoubleArrayParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Array[Double]] for Java.
DoubleArrayParam(Params, String, String, Function1<double[], Object>) - Constructor for class org.apache.spark.ml.param.DoubleArrayParam
DoubleArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.DoubleArrayParam
DoubleDecimal() - Static method in class org.apache.spark.sql.types.DecimalType
DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function: A function that returns zero or more records of type Double from each input record.
DoubleFunction<T> - Interface in org.apache.spark.api.java.function: A function that returns Doubles, and can be used to construct DoubleRDDs.
DoubleParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Double] for Java.
DoubleParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
DoubleParam(String, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
DoubleParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
DoubleParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
DoubleRDDFunctions - Class in org.apache.spark.rdd: Extra functions available on RDDs of Doubles through an implicit conversion.
DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.rdd.RDD
doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
DoubleType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the DoubleType object.
DoubleType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing Double values.
doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
doUnpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast: Actually unpersist the broadcasted value on the executors.
DRIVER_EXTRA_CLASSPATH - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the driver class path.
DRIVER_EXTRA_JAVA_OPTIONS - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the driver VM options.
DRIVER_EXTRA_LIBRARY_PATH - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the driver native library path.
DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext: Executor id for the driver.
DRIVER_MEMORY - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the driver memory.
driverActorSystemName() - Static method in class org.apache.spark.SparkEnv
driverLogs() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
drop(String) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with a column dropped.
drop(Column) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with a column dropped.
drop() - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing any null or NaN values.
drop(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing null or NaN values.
drop(String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
drop(Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
drop(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
drop(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
drop(int) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values.
drop(int, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values in the specified columns.
drop(int, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values in the specified columns.
dropDuplicates() - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame that contains only the unique rows from this DataFrame.
dropDuplicates(Seq<String>) - Method in class org.apache.spark.sql.DataFrame: (Scala-specific) Returns a new DataFrame with duplicate rows removed, considering only the subset of columns.
dropDuplicates(String[]) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with duplicate rows removed, considering only the subset of columns.
dropLast() - Method in class org.apache.spark.ml.feature.OneHotEncoder: Whether to drop the last category in the encoded vector (default: true)
dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
Dst - Static variable in class org.apache.spark.graphx.TripletFields: Expose the destination and edge fields but not the source field.
dstAttr() - Method in class org.apache.spark.graphx.EdgeContext: The vertex attribute of the edge's destination vertex.
dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet: The destination vertex attribute
dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
dstId() - Method in class org.apache.spark.graphx.Edge
dstId() - Method in class org.apache.spark.graphx.EdgeContext: The vertex id of the edge's destination vertex.
dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
DStream<T> - Class in org.apache.spark.streaming.dstream: A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data (see org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
dtypes() - Method in class org.apache.spark.sql.DataFrame: Returns all column names and their data types as an array.
DummySerializerInstance - Class in org.apache.spark.serializer: Unfortunately, we need a serializer instance in order to construct a DiskBlockObjectWriter.
duration() - Method in class org.apache.spark.scheduler.TaskInfo
Duration - Class in org.apache.spark.streaming
Duration(long) - Constructor for class org.apache.spark.streaming.Duration
duration() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo: Return the duration of this output operation.
Durations - Class in org.apache.spark.streaming
Durations() - Constructor for class org.apache.spark.streaming.Durations
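
The minNonNulls rule listed above for DataFrameNaFunctions.drop(int) can be sketched in plain Java. This is only a model of the rule as the index describes it, not Spark code: the class DropMinNonNullsSketch, its methods, and the representation of a row as Object[] are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone model of the "minNonNulls" rule; it does not use Spark.
class DropMinNonNullsSketch {

    // Counts the values in a row that are neither null nor NaN.
    public static int countValid(Object[] row) {
        int valid = 0;
        for (Object v : row) {
            if (v == null) continue;
            if (v instanceof Double && ((Double) v).isNaN()) continue;
            if (v instanceof Float && ((Float) v).isNaN()) continue;
            valid++;
        }
        return valid;
    }

    // Keeps only the rows that have at least minNonNulls valid values,
    // i.e. drops rows containing fewer than minNonNulls of them.
    public static List<Object[]> drop(int minNonNulls, List<Object[]> rows) {
        List<Object[]> kept = new ArrayList<>();
        for (Object[] row : rows) {
            if (countValid(row) >= minNonNulls) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
            new Object[] {"a", 1.0, 2},          // 3 valid values
            new Object[] {null, Double.NaN, 3},  // 1 valid value
            new Object[] {"c", null, 4});        // 2 valid values

        // With minNonNulls = 2, only the second row is dropped.
        for (Object[] row : drop(2, rows)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Under this reading, a row survives when its count of usable values meets the threshold. The variants that take column names would presumably apply the same count to the named columns only.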

E

Edge<ED> - Class in org.apache.spark.graphx: A single directed edge consisting of a source id, target id, and the data associated with the edge.
Edge(long, long, ED) - Constructor for class org.apache.spark.graphx.Edge
EdgeActiveness - Enum in org.apache.spark.graphx.impl: Criteria for filtering edges based on activeness.
EdgeContext<VD,ED,A> - Class in org.apache.spark.graphx: Represents an edge along with its neighboring vertices and allows sending messages along the edge.
EdgeContext() - Constructor for class org.apache.spark.graphx.EdgeContext
EdgeDirection - Class in org.apache.spark.graphx: The direction of a directed edge relative to a vertex.
edgeListFile(SparkContext, String, boolean, int, StorageLevel, StorageLevel) - Static method in class org.apache.spark.graphx.GraphLoader: Loads a graph from an edge list formatted file where each line contains two integers: a source id and a target id.
EdgeOnly - Static variable in class org.apache.spark.graphx.TripletFields: Expose only the edge field and not the source or destination field.
EdgeRDD<ED> - Class in org.apache.spark.graphx: EdgeRDD[ED, VD] extends RDD[Edge[ED]] by storing the edges in columnar format on each partition for performance.
EdgeRDD(SparkContext, Seq<Dependency<?>>) - Constructor for class org.apache.spark.graphx.EdgeRDD
EdgeRDDImpl<ED,VD> - Class in org.apache.spark.graphx.impl
edges() - Method in class org.apache.spark.graphx.Graph: An RDD containing the edges and their associated attributes.
edges() - Method in class org.apache.spark.graphx.impl.GraphImpl
EdgeTriplet<VD,ED> - Class in org.apache.spark.graphx: An edge triplet represents an edge along with the vertex attributes of its neighboring vertices.
EdgeTriplet() - Constructor for class org.apache.spark.graphx.EdgeTriplet
Either() - Static method in class org.apache.spark.graphx.EdgeDirection: Edges originating from *or* arriving at a vertex of interest.
elements() - Method in class org.apache.spark.util.Vector
elementType() - Method in class org.apache.spark.sql.types.ArrayType
ElementwiseProduct - Class in org.apache.spark.ml.feature: :: Experimental :: Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided "weight" vector.
ElementwiseProduct(String) - Constructor for class org.apache.spark.ml.feature.ElementwiseProduct
ElementwiseProduct() - Constructor for class org.apache.spark.ml.feature.ElementwiseProduct
ElementwiseProduct - Class in org.apache.spark.mllib.feature: Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided "weight" vector.
ElementwiseProduct(Vector) - Constructor for class org.apache.spark.mllib.feature.ElementwiseProduct
EMLDAOptimizer - Class in org.apache.spark.mllib.clustering: :: DeveloperApi ::
EMLDAOptimizer() - Constructor for class org.apache.spark.mllib.clustering.EMLDAOptimizer
empty() - Static method in class org.apache.spark.ml.param.ParamMap: Returns an empty param map.
empty() - Static method in class org.apache.spark.sql.types.Metadata: Returns an empty Metadata.
empty() - Static method in class org.apache.spark.storage.BlockStatus
emptyDataFrame() - Method in class org.apache.spark.sql.SQLContext: :: Experimental :: Returns a DataFrame with no rows or columns.
emptyNode(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return a node with the given node id (but nothing else set).
emptyRDD() - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD that has no partitions or elements.
emptyRDD(ClassTag<T>) - Method in class org.apache.spark.SparkContext: Get an RDD that has no partitions or elements.
emptyResult() - Method in class org.apache.spark.sql.SQLContext
encode(Column, String) - Static method in class org.apache.spark.sql.functions: Encodes the first argument from a string into a binary using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').
encodeLabeledPoint(LabeledPoint, int) - Static method in class org.apache.spark.ml.classification.LabelConverter: Encodes a label as a vector.
Encoder<T> - Interface in org.apache.spark.sql: :: Experimental :: Used to convert a JVM object of type T to and from the internal Spark SQL representation.
encoder() - Method in class org.apache.spark.sql.TypedColumn
Encoders - Class in org.apache.spark.sql: :: Experimental :: Methods for creating an Encoder.
Encoders() - Constructor for class org.apache.spark.sql.Encoders
endsWith(Column) - Method in class org.apache.spark.sql.Column: String ends with.
endsWith(String) - Method in class org.apache.spark.sql.Column: String ends with another string literal.
endTime() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
endTime() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
endTime() - Method in class org.apache.spark.ui.jobs.JobProgressListener
entries() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Entropy - Class in org.apache.spark.mllib.tree.impurity: :: Experimental :: Class for calculating entropy during binary classification.
Entropy() - Constructor for class org.apache.spark.mllib.tree.impurity.Entropy
EnumUtil - Class in org.apache.spark.util
EnumUtil() - Constructor for class org.apache.spark.util.EnumUtil
env() - Method in class org.apache.spark.api.java.JavaSparkContext
env() - Method in class org.apache.spark.streaming.StreamingContext
environmentDetails() - Method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
EnvironmentListener - Class in org.apache.spark.ui.env: :: DeveloperApi :: A SparkListener that prepares information to be displayed on the EnvironmentTab
EnvironmentListener() - Constructor for class org.apache.spark.ui.env.EnvironmentListener
EPSILON() - Static method in class org.apache.spark.mllib.util.MLUtils
eqNullSafe(Object) - Method in class org.apache.spark.sql.Column: Equality test that is safe for null values.
EqualNullSafe - Class in org.apache.spark.sql.sources: Performs equality comparison, similar to EqualTo.
EqualNullSafe(String, Object) - Constructor for class org.apache.spark.sql.sources.EqualNullSafe
equals(Object) - Method in class org.apache.spark.graphx.EdgeDirection
equals(Object) - Method in class org.apache.spark.HashPartitioner
equals(Object) - Method in class org.apache.spark.ml.attribute.AttributeGroup
equals(Object) - Method in class org.apache.spark.ml.attribute.BinaryAttribute
equals(Object) - Method in class org.apache.spark.ml.attribute.NominalAttribute
equals(Object) - Method in class org.apache.spark.ml.attribute.NumericAttribute
equals(Object) - Method in class org.apache.spark.ml.param.Param
equals(Object) - Method in class org.apache.spark.ml.tree.CategoricalSplit
equals(Object) - Method in class org.apache.spark.ml.tree.ContinuousSplit
equals(Object) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
equals(Object) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
equals(Object) - Method in interface org.apache.spark.mllib.linalg.Vector
equals(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
equals(Object) - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
equals(Object) - Method in class org.apache.spark.mllib.tree.model.Predict
equals(Object) - Method in class org.apache.spark.RangePartitioner
equals(Object) - Method in class org.apache.spark.scheduler.AccumulableInfo
equals(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
equals(Object) - Method in class org.apache.spark.scheduler.InputFormatInfo
equals(Object) - Method in class org.apache.spark.scheduler.SplitInfo
equals(Object) - Method in class org.apache.spark.sql.Column
equals(Object) - Method in interface org.apache.spark.sql.Row
equals(Object) - Method in class org.apache.spark.sql.types.Decimal
equals(Object) - Method in class org.apache.spark.sql.types.Metadata
equals(Object) - Method in class org.apache.spark.storage.BlockId
equals(Object) - Method in class org.apache.spark.storage.BlockManagerId
equals(Object) - Method in class org.apache.spark.storage.StorageLevel
equals(Object) - Method in class org.apache.spark.streaming.kafka.Broker
equals(Object) - Method in class org.apache.spark.streaming.kafka.OffsetRange
equalTo(Object) - Method in class org.apache.spark.sql.Column: Equality test.
EqualTo - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a value equal to value.
EqualTo(String, Object) - Constructor for class org.apache.spark.sql.sources.EqualTo
errorMessage() - Method in class org.apache.spark.status.api.v1.TaskData
estimate(double[]) - Method in class org.apache.spark.mllib.stat.KernelDensity: Estimates probability density function at the given array of points.
estimate(Object) - Static method in class org.apache.spark.util.SizeEstimator: Estimate the number of bytes that the given object takes up on the JVM heap.
estimatedDocConcentration() - Method in class org.apache.spark.ml.clustering.LDAModel: Value for docConcentration estimated from data.
Estimator<M extends Model<M>> - Class in org.apache.spark.ml: :: DeveloperApi :: Abstract class for estimators that fit models to data.
Estimator() - Constructor for class org.apache.spark.ml.Estimator
evaluate(DataFrame) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
evaluate(DataFrame, ParamMap) - Method in class org.apache.spark.ml.evaluation.Evaluator: Evaluates model output and returns a scalar metric (larger is better).
evaluate(DataFrame) - Method in class org.apache.spark.ml.evaluation.Evaluator: Evaluates the output.
evaluate(DataFrame) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
evaluate(DataFrame) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
evaluate(Row) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Calculates the final result of this UserDefinedAggregateFunction based on the given aggregation buffer.
evaluateEachIteration(RDD<LabeledPoint>, Loss) - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel: Method to compute error or loss for every iteration of gradient boosting.
Evaluator - Class in org.apache.spark.ml.evaluation: :: DeveloperApi :: Abstract class for evaluators that compute metrics from predictions.
Evaluator() - Constructor for class org.apache.spark.ml.evaluation.Evaluator
event() - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
except(DataFrame) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame containing rows in this frame but not in another frame.
exception() - Method in class org.apache.spark.ExceptionFailure
exception() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread: Contains the exception thrown while writing the parent iterator to the external process.
ExceptionFailure - Class in org.apache.spark: :: DeveloperApi :: Task failed due to a runtime exception.
ExceptionFailure(String, String, StackTraceElement[], String, Option<TaskMetrics>, Option<ThrowableSerializationWrapper>) - Constructor for class org.apache.spark.ExceptionFailure
execId() - Method in class org.apache.spark.ExecutorLostFailure
execId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
executePlan(LogicalPlan) - Method in class org.apache.spark.sql.hive.HiveContext
executePlan(LogicalPlan) - Method in class org.apache.spark.sql.SQLContext
executeSql(String) - Method in class org.apache.spark.sql.SQLContext
executionHive() - Method in class org.apache.spark.sql.hive.HiveContext: The copy of the hive client that is used for execution.
ExecutionListenerManager - Class in org.apache.spark.sql.util: :: Experimental ::
EXECUTOR_CORES - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the number of executor CPU cores.
EXECUTOR_EXTRA_CLASSPATH - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the executor class path.
EXECUTOR_EXTRA_JAVA_OPTIONS - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the executor VM options.
EXECUTOR_EXTRA_LIBRARY_PATH - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the executor native library path.
EXECUTOR_MEMORY - Static variable in class org.apache.spark.launcher.SparkLauncher: Configuration key for the executor memory.
executorActorSystemName() - Static method in class org.apache.spark.SparkEnv
executorDeserializeTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
executorDeserializeTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
executorEnvs() - Method in class org.apache.spark.SparkContext
executorHost() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
executorId() - Method in class org.apache.spark.ExecutorRegistered
executorId() - Method in class org.apache.spark.ExecutorRemoved
executorId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
executorId() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
executorId() - Method in class org.apache.spark.scheduler.TaskInfo
executorId() - Method in class org.apache.spark.SparkEnv
executorId() - Method in class org.apache.spark.status.api.v1.TaskData
executorId() - Method in class org.apache.spark.storage.BlockManagerId
executorId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
executorIdToBlockManagerId() - Method in class org.apache.spark.ui.jobs.JobProgressListener
executorIdToData() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorIdToStorageStatus() - Method in class org.apache.spark.storage.StorageStatusListener
ExecutorInfo - Class in org.apache.spark.scheduler.cluster: :: DeveloperApi :: Stores information about an executor to pass from the scheduler to SparkListeners.
ExecutorInfo(String, int, Map<String, String>) - Constructor for class org.apache.spark.scheduler.cluster.ExecutorInfo
executorInfo() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
executorLogs() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
ExecutorLostFailure - Class in org.apache.spark: :: DeveloperApi :: The task failed because the executor that it was running on was lost.
ExecutorLostFailure(String, boolean, Option<String>) - Constructor for class org.apache.spark.ExecutorLostFailure
executorPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
ExecutorRegistered - Class in org.apache.spark
ExecutorRegistered(String) - Constructor for class org.apache.spark.ExecutorRegistered
ExecutorRemoved - Class in org.apache.spark
ExecutorRemoved(String) - Constructor for class org.apache.spark.ExecutorRemoved
executorRunTime() - Method in class org.apache.spark.status.api.v1.StageData
executorRunTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
executorRunTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
executors() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
ExecutorsListener - Class in org.apache.spark.ui.exec: :: DeveloperApi :: A SparkListener that prepares information to be displayed on the ExecutorsTab
ExecutorsListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.exec.ExecutorsListener
ExecutorStageSummary - Class in org.apache.spark.status.api.v1
ExecutorSummary - Class in org.apache.spark.status.api.v1
executorSummary() - Method in class org.apache.spark.status.api.v1.StageData
executorToDuration() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToInputBytes() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToInputRecords() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToLogUrls() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToOutputBytes() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToOutputRecords() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToShuffleRead() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToShuffleWrite() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToTasksActive() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToTasksComplete() - Method in class org.apache.spark.ui.exec.ExecutorsListener
executorToTasksFailed() - Method in class org.apache.spark.ui.exec.ExecutorsListener
exists() - Method in class org.apache.spark.streaming.State: Whether the state already exists
exitCausedByApp() - Method in class org.apache.spark.ExecutorLostFailure
exp(Column) - Static method in class org.apache.spark.sql.functions: Computes the exponential of the given value.
exp(String) - Static method in class org.apache.spark.sql.functions: Computes the exponential of the given column.
ExpectationSum - Class in org.apache.spark.mllib.clustering
ExpectationSum(double, double[], DenseVector<Object>[], DenseMatrix<Object>[]) - Constructor for class org.apache.spark.mllib.clustering.ExpectationSum
Experimental - Annotation Type in org.apache.spark.annotation: An experimental user-facing API.
experimental() - Method in class org.apache.spark.sql.SQLContext: :: Experimental :: A collection of methods that are considered experimental, but can be used to hook into the query planner for advanced functionality.
ExperimentalMethods - Class in org.apache.spark.sql: :: Experimental :: Holder for experimental methods for the bravest.
ExperimentalMethods(SQLContext) - Constructor for class org.apache.spark.sql.ExperimentalMethods
explain(boolean) - Method in class org.apache.spark.sql.Column: Prints the expression to the console for debugging purposes.
explain(boolean) - Method in class org.apache.spark.sql.DataFrame: Prints the plans (logical and physical) to the console for debugging purposes.
explain() - Method in class org.apache.spark.sql.DataFrame: Prints the physical plan to the console for debugging purposes.
explain(boolean) - Method in class org.apache.spark.sql.Dataset: Prints the plans (logical and physical) to the console for debugging purposes.
explain() - Method in class org.apache.spark.sql.Dataset: Prints the physical plan to the console for debugging purposes.
explainedVariance() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
explainedVariance() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics: Returns the variance explained by regression.
explainParam(Param<?>) - Method in interface org.apache.spark.ml.param.Params
explainParams() - Method in interface org.apache.spark.ml.param.Params
explode(Seq<Column>, Function1<Row, TraversableOnce<A>>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.DataFrame: (Scala-specific) Returns a new DataFrame where each row has been expanded to zero or more rows by the provided function.
explode(String, String, Function1<A, TraversableOnce>, TypeTags.TypeTag) - Method in class org.apache.spark.sql.DataFrame: (Scala-specific) Returns a new DataFrame where a single column has been expanded to zero or more rows by the provided function.
explode(Column) - Static method in class org.apache.spark.sql.functions: Creates a new row for each element in the given array or map column.
expm1(Column) - Static method in class org.apache.spark.sql.functions: Computes the exponential of the given value minus one.
expm1(String) - Static method in class org.apache.spark.sql.functions: Computes the exponential of the given column minus one.
ExponentialGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the exponential distribution with the given mean.
ExponentialGenerator(double) - Constructor for class org.apache.spark.mllib.random.ExponentialGenerator
exponentialJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.exponentialRDD(org.apache.spark.SparkContext, double, long, int, long).
exponentialJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.exponentialJavaRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, long) with the default seed.
exponentialJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.exponentialJavaRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, long) with the default number of partitions and the default seed.
exponentialJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.exponentialVectorRDD(org.apache.spark.SparkContext, double, long, int, int, long).
exponentialJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.exponentialJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, int, long) with the default seed.
exponentialJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.exponentialJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, int, long) with the default number of partitions and the default seed.
exponentialRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the exponential distribution with the input mean.
exponentialVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the exponential distribution with the input mean.
expr() - Method in class org.apache.spark.sql.Column
expr(String) - Static method in class org.apache.spark.sql.functions: Parses the expression string into the column that it represents, similar to DataFrame.selectExpr
externalBlockStoreFolderName() - Method in class org.apache.spark.SparkContext
externalBlockStoreSize() - Method in class org.apache.spark.storage.BlockStatus
externalBlockStoreSize() - Method in class org.apache.spark.storage.BlockUpdatedInfo
externalBlockStoreSize() - Method in class org.apache.spark.storage.RDDInfo
extractAFTPoints(DataFrame) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression: Extract featuresCol, labelCol and censorCol from input dataset, and put them in an RDD with strong types.
extractDistribution(Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
extractDoubleDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
extractLabeledPoints(DataFrame) - Method in class org.apache.spark.ml.Predictor: Extract labelCol and featuresCol from the given dataset, and put them in an RDD with strong types.
extractLongDistribution(Seq<Tuple2<TaskInfo, TaskMetrics>>, Function2<TaskInfo, TaskMetrics, Option<Object>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
extractParamMap(ParamMap) - Method in interface org.apache.spark.ml.param.Params: Extracts the embedded default param values and user-supplied values, and then merges them with extra values from input into a flat param map, where the latter value is used if there exist conflicts, i.e., with ordering: default param values < user-supplied values < extra.
extractParamMap() - Method in interface org.apache.spark.ml.param.Params: extractParamMap with no extra values.
extraStrategies() - Method in class org.apache.spark.sql.ExperimentalMethods: Allows extra strategies to be injected into the query planner at runtime.
eye(int) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate an Identity Matrix in DenseMatrix format.
eye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a dense Identity Matrix in Matrix format.

F

f() - Method in class org.apache.spark.sql.UserDefinedFunction
f1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns document-based f1-measure averaged by the number of documents
f1Measure(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns f1-measure for a given label (category)
factorial(Column) - Static method in class org.apache.spark.sql.functions: Computes the factorial of the given value.
failed() - Method in class org.apache.spark.scheduler.TaskInfo
failedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
failedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
failedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
failedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
failureReason() - Method in class org.apache.spark.scheduler.StageInfo: If the stage failed, the reason why.
failureReason() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
falsePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns false positive rate for a given label (category)
feature() - Method in class org.apache.spark.mllib.tree.model.Split
featureImportances() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel: Estimate of the importance of each feature.
featureImportances() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel: Estimate of the importance of each feature.
featureIndex() - Method in class org.apache.spark.ml.tree.CategoricalSplit
featureIndex() - Method in class org.apache.spark.ml.tree.ContinuousSplit
featureIndex() - Method in interface org.apache.spark.ml.tree.Split: Index of feature which this split tests
features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
featuresCol() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
featuresCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Field in "predictions" which gives the features of each instance as a vector.
featuresCol() - Method in class org.apache.spark.ml.regression.LinearRegressionTrainingSummary
featuresDataType() - Method in class org.apache.spark.ml.PredictionModel: Returns the SQL DataType corresponding to the FeaturesType type parameter.
FeatureType - Class in org.apache.spark.mllib.tree.configuration: Enum to describe whether a feature is "continuous" or "categorical"
FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
featureType() - Method in class org.apache.spark.mllib.tree.model.Split
FetchFailed - Class in org.apache.spark: :: DeveloperApi :: Task failed to fetch shuffle data from a remote node.
FetchFailed(BlockManagerId, int, int, int, String) - Constructor for class org.apache.spark.FetchFailed
fetchPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
fetchWaitTime() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
fetchWaitTime() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
field() - Method in class org.apache.spark.storage.BroadcastBlockId
fieldIndex(String) - Method in interface org.apache.spark.sql.Row: Returns the index of a given field name.
fieldIndex(String) - Method in class org.apache.spark.sql.types.StructType: Returns index of a given field
fieldNames() - Method in class org.apache.spark.sql.types.StructType: Returns all field names in an array.
fields() - Method in class org.apache.spark.sql.types.StructType
FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
files() - Method in class org.apache.spark.SparkContext
fileStream(String, Class<K>, Class<V>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fileStream(String, Function1<Path, Object>, boolean, Configuration, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
fill(double) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
fill(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null values in string columns with value.
fill(double, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
fill(double, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns.
fill(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null values in specified string columns.
fill(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that replaces null values in specified string columns.
fill(Map<String, Object>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Returns a new DataFrame that replaces null values.
fill(Map<String, Object>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Returns a new DataFrame that replaces null values.
filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Function1<Graph<VD, ED>, Graph<VD2, ED2>>, Function1<EdgeTriplet<VD2, ED2>, Object>, Function2<Object, VD2, Object>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.GraphOps: Filter the graph by computing some values to filter on, and applying the predicates.
filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
filter(Function1<Tuple2<Object, VD>, Object>) - Method in class org.apache.spark.graphx.VertexRDD: Restricts the vertex set to the set of vertices satisfying the given predicate.
filter(Params) - Method in class org.apache.spark.ml.param.ParamMap: Filters this param map for the given parent.
filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD containing only the elements that satisfy a predicate.
filter(Column) - Method in class org.apache.spark.sql.DataFrame: Filters rows using the given condition.
filter(String) - Method in class org.apache.spark.sql.DataFrame: Filters rows using the given SQL expression.
filter(Function1<T, Object>) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Returns a new Dataset that only contains elements where func returns true.
filter(FilterFunction<T>) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Returns a new Dataset that only contains elements where func returns true.
Filter - Class in org.apache.spark.sql.sources: A filter predicate for data sources.
Filter() - Constructor for class org.apache.spark.sql.sources.Filter
filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream containing only the elements that satisfy a predicate.
filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream containing only the elements that satisfy a predicate.
filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream containing only the elements that satisfy a predicate.
filterByRange(K, K) - Method in class org.apache.spark.rdd.OrderedRDDFunctions: Returns an RDD containing only the elements in the inclusive range lower to upper.
FilterFunction<T> - Interface in org.apache.spark.api.java.function: Base interface for a function used in Dataset's filter function.
filterWith(Function1<Object, A>, Function2<T, A, Object>) - Method in class org.apache.spark.rdd.RDD: Filters this RDD with p, where p takes an additional parameter of type A.
findSplitsBins(RDD<LabeledPoint>, org.apache.spark.mllib.tree.impl.DecisionTreeMetadata) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Returns splits and bins for decision tree calculation.
findSynonyms(String, int) - Method in class org.apache.spark.ml.feature.Word2VecModel: Find "num" number of words closest in similarity to the given word.
findSynonyms(Vector, int) - Method in class org.apache.spark.ml.feature.Word2VecModel: Find "num" number of words closest in similarity to the given vector representation of the word.
findSynonyms(String, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
findSynonyms(Vector, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
finish(B) - Method in class org.apache.spark.sql.expressions.Aggregator: Transform the output of the reduction.
finished() - Method in class org.apache.spark.scheduler.TaskInfo
finishTime() - Method in class org.apache.spark.scheduler.TaskInfo: The time when the task has completed successfully (including the time to remotely fetch results, if necessary).
first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
first() - Method in class org.apache.spark.api.java.JavaPairRDD
first() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the first element in this RDD.
first() - Method in class org.apache.spark.rdd.RDD: Return the first element in this RDD.
first() - Method in class org.apache.spark.sql.DataFrame: Returns the first row.
first() - Method in class org.apache.spark.sql.Dataset: Returns the first element in this Dataset.
first(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the first value in a group.
first(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the first value of a column in a group.
firstParent(ClassTag) - Method in class org.apache.spark.rdd.RDD: Returns the first parent RDD
fit(DataFrame) - Method in class org.apache.spark.ml.classification.OneVsRest
fit(DataFrame) - Method in class org.apache.spark.ml.clustering.KMeans
fit(DataFrame) - Method in class org.apache.spark.ml.clustering.LDA
fit(DataFrame, ParamPair<?>, ParamPair<?>...) - Method in class org.apache.spark.ml.Estimator: Fits a single model to the input data with optional parameters.
fit(DataFrame, ParamPair<?>, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Estimator: Fits a single model to the input data with optional parameters.
fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Estimator: Fits a single model to the input data with provided parameter map.
fit(DataFrame) - Method in class org.apache.spark.ml.Estimator: Fits a model to the input data.
fit(DataFrame, ParamMap[]) - Method in class org.apache.spark.ml.Estimator: Fits multiple models to the input data with multiple sets of parameters.
fit(DataFrame) - Method in class org.apache.spark.ml.feature.ChiSqSelector
fit(DataFrame) - Method in class org.apache.spark.ml.feature.CountVectorizer
fit(DataFrame) - Method in class org.apache.spark.ml.feature.IDF
fit(DataFrame) - Method in class org.apache.spark.ml.feature.MinMaxScaler
fit(DataFrame) - Method in class org.apache.spark.ml.feature.PCA: Computes a PCAModel that contains the principal components of the input vectors.
fit(DataFrame) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
fit(DataFrame) - Method in class org.apache.spark.ml.feature.RFormula
fit(DataFrame) - Method in class org.apache.spark.ml.feature.StandardScaler
fit(DataFrame) - Method in class org.apache.spark.ml.feature.StringIndexer
fit(DataFrame) - Method in class org.apache.spark.ml.feature.VectorIndexer
fit(DataFrame) - Method in class org.apache.spark.ml.feature.Word2Vec
fit(DataFrame) - Method in class org.apache.spark.ml.Pipeline: Fits the pipeline to the input dataset with additional parameters.
fit(DataFrame) - Method in class org.apache.spark.ml.Predictor
fit(DataFrame) - Method in class org.apache.spark.ml.recommendation.ALS
fit(DataFrame) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
fit(DataFrame) - Method in class org.apache.spark.ml.regression.IsotonicRegression
fit(DataFrame) - Method in class org.apache.spark.ml.tuning.CrossValidator
fit(DataFrame) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
fit(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF: Computes the inverse document frequency.
fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF: Computes the inverse document frequency.
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.PCA: Computes a PCAModel that contains the principal components of the input vectors.
fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.PCA: Java-friendly version of fit()
fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.StandardScaler: Computes the mean and variance and stores as a model to be used for later scaling.
fit(RDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
fit(JavaRDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(Function1<T, TraversableOnce>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMap(Function1<Row, TraversableOnce<R>>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame: Returns a new RDD by first applying a function to all rows of this DataFrame, and then flattening the results.
flatMap(Function1<T, TraversableOnce>, Encoder) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Returns a new Dataset by first applying a function to all elements of this Dataset, and then flattening the results.
flatMap(FlatMapFunction<T, U>, Encoder) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Returns a new Dataset by first applying a function to all elements of this Dataset, and then flattening the results.
flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
flatMap(Function1<T, Traversable>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function: A function that returns zero or more output records from each input record.
FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function: A function that takes two inputs and returns zero or more output records.
flatMapGroups(Function2<K, Iterator<V>, TraversableOnce>, Encoder) - Method in class org.apache.spark.sql.GroupedDataset: Applies the given function to each group of data.
flatMapGroups(FlatMapGroupsFunction<K, V, U>, Encoder) - Method in class org.apache.spark.sql.GroupedDataset: Applies the given function to each group of data.
FlatMapGroupsFunction<K,V,R> - Interface in org.apache.spark.api.java.function: A function that returns zero or more output records from each grouping key and its values.
flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream, and then flattening the results
flatMapValues(Function<V, Iterable>) - Method in class org.apache.spark.api.java.JavaPairRDD: Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function1<V, TraversableOnce>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues(Function<V, Iterable>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying a flatmap function to the value of each key-value pair in 'this' DStream without changing the key.
flatMapValues(Function1<V, TraversableOnce>, ClassTag) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying a flatmap function to the value of each key-value pair in 'this' DStream without changing the key.
flatMapWith(Function1<Object, A>, boolean, Function2<T, A, Seq>, ClassTag) - Method in class org.apache.spark.rdd.RDD: FlatMaps f over this RDD, where f takes an additional parameter of type A.
FLOAT() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable float type.
FloatDecimal() - Static method in class org.apache.spark.sql.types.DecimalType
FloatParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Float] for Java.
FloatParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
FloatParam(String, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
FloatParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
FloatParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
floatToFloatWritable(float) - Static method in class org.apache.spark.SparkContext
FloatType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the FloatType object.
FloatType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing Float values.
floatWritableConverter() - Static method in class org.apache.spark.SparkContext
floor(Column) - Static method in class org.apache.spark.sql.functions: Computes the floor of the given value.
floor(String) - Static method in class org.apache.spark.sql.functions: Computes the floor of the given column.
floor() - Method in class org.apache.spark.sql.types.Decimal
floor(Duration) - Method in class org.apache.spark.streaming.Time
floor(Duration, Time) - Method in class org.apache.spark.streaming.Time
FlumeUtils - Class in org.apache.spark.streaming.flume
FlumeUtils() - Constructor for class org.apache.spark.streaming.flume.FlumeUtils
flush() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
flush() - Method in class org.apache.spark.serializer.SerializationStream
flush() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
fMeasure(double, double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns f-measure for a given label (category)
fMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns f1-measure for a given label (category)
fMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns f-measure (equal to precision and recall because precision equals recall)
fMeasureByThreshold() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Returns a DataFrame with two fields (threshold, F-Measure) describing the curve with beta = 1.0.
fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, F-Measure) curve.
fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, F-Measure) curve with beta = 1.0.
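The fMeasure entries above take an optional beta and note that the overall f-measure equals precision when precision equals recall. A minimal sketch of the standard F-beta formula, F = (1 + beta^2) * P * R / (beta^2 * P + R), makes both points concrete. This is plain Java and not Spark's implementation; the class FMeasureSketch and the choice to return 0.0 for a zero denominator are assumptions made for illustration.

```java
final class FMeasureSketch {
    /**
     * Standard F-beta score from precision and recall.
     * Returns 0.0 when the denominator is zero (an assumption of this sketch).
     */
    static double fMeasure(double precision, double recall, double beta) {
        double betaSqr = beta * beta;
        double denominator = betaSqr * precision + recall;
        if (denominator == 0.0) {
            return 0.0;
        }
        return (1.0 + betaSqr) * precision * recall / denominator;
    }

    public static void main(String[] args) {
        // With beta = 1.0 this is the harmonic mean of precision and recall (F1).
        System.out.println(fMeasure(1.0, 0.5, 1.0));
        // When precision equals recall, the score equals that common value for any beta.
        System.out.println(fMeasure(0.5, 0.5, 1.0));
    }
}
```

A beta above 1.0 weights recall more heavily and a beta below 1.0 weights precision more heavily.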
fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Aggregate the elements of each partition, and then the results for all the partitions, using a given associative and commutative function and a neutral "zero value".
fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD: Aggregate the elements of each partition, and then the results for all the partitions, using a given associative and commutative function and a neutral "zero value".
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative function and a neutral "zero value" which may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication.).
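The requirement that the zero value be neutral follows from it being added "an arbitrary number of times". The sketch below simulates that in plain Java without Spark: partitions are modeled as nested lists, and the zero value is assumed to be applied once per key in each partition before partial results are merged. All names here (FoldByKeySketch, kv, onePartition, twoPartitions) are hypothetical.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

final class FoldByKeySketch {
    /** Folds values per key within each partition starting from zero, then merges the partial results. */
    static <K, V> Map<K, V> foldByKey(
            List<List<Map.Entry<K, V>>> partitions, V zero, BinaryOperator<V> func) {
        Map<K, V> merged = new HashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            Map<K, V> local = new HashMap<>();
            for (Map.Entry<K, V> e : partition) {
                // The first value seen for a key in this partition is combined with zero.
                V acc = local.containsKey(e.getKey()) ? local.get(e.getKey()) : zero;
                local.put(e.getKey(), func.apply(acc, e.getValue()));
            }
            for (Map.Entry<K, V> e : local.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), func);
            }
        }
        return merged;
    }

    static Map.Entry<String, Integer> kv(String key, int value) {
        return new AbstractMap.SimpleEntry<String, Integer>(key, value);
    }

    /** The pairs (a,1), (a,2), (b,3) held in a single partition. */
    static List<List<Map.Entry<String, Integer>>> onePartition() {
        return Arrays.asList(Arrays.asList(kv("a", 1), kv("a", 2), kv("b", 3)));
    }

    /** The same pairs split across two partitions. */
    static List<List<Map.Entry<String, Integer>>> twoPartitions() {
        return Arrays.asList(
                Arrays.asList(kv("a", 1)),
                Arrays.asList(kv("a", 2), kv("b", 3)));
    }

    public static void main(String[] args) {
        // Neutral zero (0 for addition): the result does not depend on partitioning.
        System.out.println(foldByKey(onePartition(), 0, Integer::sum));
        System.out.println(foldByKey(twoPartitions(), 0, Integer::sum));
        // Non-neutral zero (1 for addition): the result for key "a" changes with partitioning.
        System.out.println(foldByKey(onePartition(), 1, Integer::sum));
        System.out.println(foldByKey(twoPartitions(), 1, Integer::sum));
    }
}
```

With a zero of 0 both layouts sum to the same totals per key, while a zero of 1 yields a different total for key "a" once its values are spread over two partitions.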
foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Applies a function f to all elements of this RDD.
foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies a function f to all elements of this RDD.
foreach(Function1<Row, BoxedUnit>) - Method in class org.apache.spark.sql.DataFrame: Applies a function f to all rows.
foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Runs func on each element of this Dataset.
foreach(ForeachFunction<T>) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Runs func on each element of this Dataset.
foreach(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Deprecated.
As of release 0.9.0, replaced by foreachRDD
foreach(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Deprecated.
As of release 0.9.0, replaced by foreachRDD
foreach(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Deprecated.
As of 0.9.0, replaced by foreachRDD.
foreach(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Deprecated.
As of 0.9.0, replaced by foreachRDD.
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.DenseVector
foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Matrix: Applies a function f to all the active elements of dense and sparse matrix.
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in class org.apache.spark.mllib.linalg.SparseVector
foreachActive(Function2<Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Vector: Applies a function f to all the active elements of dense and sparse vector.
foreachAsync(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: The asynchronous version of the foreach action, which applies a function f to all the elements of this RDD.
foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions: Applies a function f to all elements of this RDD.
ForeachFunction<T> - Interface in org.apache.spark.api.java.function: Base interface for a function used in Dataset's foreach function.
foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Applies a function f to each partition of this RDD.
foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies a function f to each partition of this RDD.
foreachPartition(Function1<Iterator<Row>, BoxedUnit>) - Method in class org.apache.spark.sql.DataFrame: Applies a function f to each partition of this DataFrame.
foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Runs func on each partition of this Dataset.
foreachPartition(ForeachPartitionFunction<T>) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Runs func on each partition of this Dataset.
foreachPartitionAsync(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike: The asynchronous version of the foreachPartition action, which applies a function f to each partition of this RDD.
foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions: Applies a function f to each partition of this RDD.
ForeachPartitionFunction<T> - Interface in org.apache.spark.api.java.function: Base interface for a function used in Dataset's foreachPartition function.
foreachRDD(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Deprecated. As of release 1.6.0, replaced by foreachRDD(VoidFunction<R>).
foreachRDD(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Deprecated. As of release 1.6.0, replaced by foreachRDD(VoidFunction2<R, Time>).
foreachRDD(VoidFunction<R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Apply a function to each RDD in this DStream.
foreachRDD(VoidFunction2<R, Time>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Apply a function to each RDD in this DStream.
foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream: Apply a function to each RDD in this DStream.
foreachWith(Function1<Object, A>, Function2<T, A, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD: Applies f to each element of this RDD, where f takes an additional parameter of type A.
format(String) - Method in class org.apache.spark.sql.DataFrameReader: Specifies the input data source format.
format(String) - Method in class org.apache.spark.sql.DataFrameWriter: Specifies the underlying output data source.
format_number(Column, int) - Static method in class org.apache.spark.sql.functions: Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places, and returns the result as a string column.
format_string(String, Column...) - Static method in class org.apache.spark.sql.functions: Formats the arguments in printf-style and returns the result as a string column.
format_string(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions: Formats the arguments in printf-style and returns the result as a string column.
formatVersion() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
formatVersion() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
formatVersion() - Method in class org.apache.spark.mllib.classification.SVMModel
formatVersion() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
formatVersion() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
formatVersion() - Method in class org.apache.spark.mllib.clustering.KMeansModel
formatVersion() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
formatVersion() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
formatVersion() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
formatVersion() - Method in class org.apache.spark.mllib.feature.Word2VecModel
formatVersion() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
formatVersion() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
formatVersion() - Method in class org.apache.spark.mllib.regression.LassoModel
formatVersion() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
formatVersion() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
formatVersion() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
formatVersion() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
formatVersion() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
formatVersion() - Method in interface org.apache.spark.mllib.util.Saveable: Current version of model save/load format.
formula() - Method in class org.apache.spark.ml.feature.RFormula: R formula parameter.
FPGrowth - Class in org.apache.spark.mllib.fpm: A parallel FP-growth algorithm to mine frequent itemsets.
FPGrowth() - Constructor for class org.apache.spark.mllib.fpm.FPGrowth: Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same as the input data}.
FPGrowth.FreqItemset<Item> - Class in org.apache.spark.mllib.fpm: Frequent itemset.
FPGrowth.FreqItemset(Object, long) - Constructor for class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
FPGrowthModel<Item> - Class in org.apache.spark.mllib.fpm: Model trained by FPGrowth, which holds frequent itemsets.
FPGrowthModel(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel
fractional() - Method in class org.apache.spark.sql.types.DecimalType
fractional() - Method in class org.apache.spark.sql.types.DoubleType
fractional() - Method in class org.apache.spark.sql.types.FloatType
freq() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
freq() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
freqItems(String[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Finding frequent items for columns, possibly with false positives.
freqItems(String[]) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Finding frequent items for columns, possibly with false positives.
freqItems(Seq<String>, double) - Method in class org.apache.spark.sql.DataFrameStatFunctions: (Scala-specific) Finding frequent items for columns, possibly with false positives.
freqItems(Seq<String>) - Method in class org.apache.spark.sql.DataFrameStatFunctions: (Scala-specific) Finding frequent items for columns, possibly with false positives.
freqItemsets() - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
freqSequences() - Method in class org.apache.spark.mllib.fpm.PrefixSpanModel
from_unixtime(Column) - Static method in class org.apache.spark.sql.functions: Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the default format (yyyy-MM-dd HH:mm:ss).
from_unixtime(Column, String) - Static method in class org.apache.spark.sql.functions: Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the given format.
from_utc_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions: Assumes given timestamp is UTC and converts to given timezone.
fromAttributes(Seq<Attribute>) - Static method in class org.apache.spark.sql.types.StructType
fromAvroFlumeEvent(AvroFlumeEvent) - Static method in class org.apache.spark.streaming.flume.SparkFlumeEvent
fromCaseClassString(String) - Static method in class org.apache.spark.sql.types.DataType: Deprecated. As of 1.2.0, replaced by DataType.fromJson().
fromCOO(int, int, Iterable<Tuple3<Object, Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate a SparseMatrix from Coordinate List (COO) format.
fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream: Convert a Scala DStream to a Java-friendly JavaDStream.
fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl: Create a graph from EdgePartitions, setting referenced vertices to defaultVertexAttr.
fromEdges(RDD<Edge<ED>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD: Creates an EdgeRDD from a set of edges.
fromEdges(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph: Construct a graph from a collection of edges.
fromEdges(EdgeRDD<?>, int, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD: Constructs a VertexRDD containing all vertices referred to in edges.
fromEdgeTuples(RDD<Tuple2<Object, Object>>, VD, Option<PartitionStrategy>, StorageLevel, StorageLevel, ClassTag<VD>) - Static method in class org.apache.spark.graphx.Graph: Construct a graph from a collection of edges encoded as vertex id pairs.
fromExistingRDDs(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl: Create a graph from a VertexRDD and an EdgeRDD with the same replicated vertex type as the vertices.
fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream: Convert a Scala InputDStream to a Java-friendly JavaInputDStream.
fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream: Convert a Scala InputDStream of pairs to a Java-friendly JavaPairInputDStream.
fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD: Convert a JavaRDD of key-value pairs to JavaPairRDD.
fromJson(String) - Static method in class org.apache.spark.mllib.linalg.Vectors: Parses the JSON representation of a vector into a Vector.
fromJson(String) - Static method in class org.apache.spark.sql.types.DataType
fromJson(String) - Static method in class org.apache.spark.sql.types.Metadata: Creates a Metadata instance from JSON.
fromName(String) - Static method in class org.apache.spark.ml.attribute.AttributeType: Gets the AttributeType object from its name.
fromOffset() - Method in class org.apache.spark.streaming.kafka.OffsetRange
fromOld(DecisionTreeModel, DecisionTreeClassifier, Map<Object, Object>, int) - Static method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel: (private[ml]) Convert a model from the old API
fromOld(GradientBoostedTreesModel, GBTClassifier, Map<Object, Object>, int) - Static method in class org.apache.spark.ml.classification.GBTClassificationModel: (private[ml]) Convert a model from the old API
fromOld(RandomForestModel, RandomForestClassifier, Map<Object, Object>, int, int) - Static method in class org.apache.spark.ml.classification.RandomForestClassificationModel: (private[ml]) Convert a model from the old API
fromOld(DecisionTreeModel, DecisionTreeRegressor, Map<Object, Object>, int) - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel: (private[ml]) Convert a model from the old API
fromOld(GradientBoostedTreesModel, GBTRegressor, Map<Object, Object>, int) - Static method in class org.apache.spark.ml.regression.GBTRegressionModel: (private[ml]) Convert a model from the old API
fromOld(RandomForestModel, RandomForestRegressor, Map<Object, Object>, int) - Static method in class org.apache.spark.ml.regression.RandomForestRegressionModel: (private[ml]) Convert a model from the old API
fromOld(Node, Map<Object, Object>) - Static method in class org.apache.spark.ml.tree.Node: Create a new Node from the old Node format, recursively creating child nodes as needed.
fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
fromPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.mllib.rdd.MLPairRDDFunctions: Implicit conversion from a pair RDD to MLPairRDDFunctions.
fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RDDFunctions: Implicit conversion from an RDD to RDDFunctions.
fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream: Convert a Scala ReceiverInputDStream of pairs to a Java-friendly JavaPairReceiverInputDStream.
fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream: Convert a Scala ReceiverInputDStream to a Java-friendly JavaReceiverInputDStream.
fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
fromStage(Stage, int, Option<Object>, Seq<Seq<TaskLocation>>) - Static method in class org.apache.spark.scheduler.StageInfo: Construct a StageInfo from a Stage.
fromString(String) - Static method in enum org.apache.spark.JobExecutionStatus
fromString(String) - Static method in class org.apache.spark.mllib.tree.loss.Losses
fromString(String) - Static method in enum org.apache.spark.status.api.v1.ApplicationStatus
fromString(String) - Static method in enum org.apache.spark.status.api.v1.StageStatus
fromString(String) - Static method in enum org.apache.spark.status.api.v1.TaskSorting
fromString(String) - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Return the StorageLevel object with the specified name.
fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.AttributeGroup: Creates an attribute group from a StructField instance.
fullOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a full outer join of this and other.
fullOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a full outer join of this and other.
fullOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a full outer join of this and other.
fullOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a full outer join of this and other.
fullOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'full outer join' between RDDs of this DStream and other DStream.
fullStackTrace() - Method in class org.apache.spark.ExceptionFailure
Function<T1,R> - Interface in org.apache.spark.api.java.function: Base interface for functions whose return types do not create special RDDs.
function(Function4<Time, KeyType, Option<ValueType>, State<StateType>, Option<MappedType>>) - Static method in class org.apache.spark.streaming.StateSpec: Create a StateSpec for setting all the specifications of the mapWithState operation on a pair DStream.
function(Function3<KeyType, Option<ValueType>, State<StateType>, MappedType>) - Static method in class org.apache.spark.streaming.StateSpec: Create a StateSpec for setting all the specifications of the mapWithState operation on a pair DStream.
function(Function4<Time, KeyType, Optional<ValueType>, State<StateType>, Optional<MappedType>>) - Static method in class org.apache.spark.streaming.StateSpec: Create a StateSpec for setting all the specifications of the mapWithState operation on a JavaPairDStream.
function(Function3<KeyType, Optional<ValueType>, State<StateType>, MappedType>) - Static method in class org.apache.spark.streaming.StateSpec: Create a StateSpec for setting all the specifications of the mapWithState operation on a JavaPairDStream.
Function0<R> - Interface in org.apache.spark.api.java.function: A zero-argument function that returns an R.
Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function: A two-argument function that takes arguments of type T1 and T2 and returns an R.
Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function: A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
Function4<T1,T2,T3,T4,R> - Interface in org.apache.spark.api.java.function: A four-argument function that takes arguments of type T1, T2, T3 and T4 and returns an R.
functionRegistry() - Method in class org.apache.spark.sql.hive.HiveContext
functionRegistry() - Method in class org.apache.spark.sql.SQLContext
functions - Class in org.apache.spark.sql
functions() - Constructor for class org.apache.spark.sql.functions
FutureAction<T> - Interface in org.apache.spark: A future for the result of an action to support cancellation.
futureExecutionContext() - Static method in class org.apache.spark.rdd.AsyncRDDActions

G

gain() - Method in class org.apache.spark.ml.tree.InternalNode
gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
gamma1() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
gamma2() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
gamma6() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
gamma7() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
GammaGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the gamma distribution with the given shape and scale.
GammaGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.GammaGenerator
gammaJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.gammaRDD(org.apache.spark.SparkContext, double, double, long, int, long).
gammaJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.gammaJavaRDD(org.apache.spark.api.java.JavaSparkContext, double, double, long, int, long) with the default seed.
gammaJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.gammaJavaRDD(org.apache.spark.api.java.JavaSparkContext, double, double, long, int, long) with the default number of partitions and the default seed.
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.gammaVectorRDD(org.apache.spark.SparkContext, double, double, long, int, int, long).
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.gammaJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, double, double, long, int, int, long) with the default seed.
gammaJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.gammaJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, double, double, long, int, int, long) with the default number of partitions and the default seed.
gammaRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD composed of i.i.d. samples from the gamma distribution with the input shape and scale.
gammaShape() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
gammaShape() - Method in class org.apache.spark.mllib.clustering.LDAModel: Shape parameter for random initialization of variational parameter gamma.
gammaShape() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
gammaVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the gamma distribution with the input shape and scale.
gaps() - Method in class org.apache.spark.ml.feature.RegexTokenizer: Indicates whether regex splits on gaps (true) or matches tokens (false).
GaussianMixture - Class in org.apache.spark.mllib.clustering: This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs).
GaussianMixture() - Constructor for class org.apache.spark.mllib.clustering.GaussianMixture: Constructs a default instance.
GaussianMixtureModel - Class in org.apache.spark.mllib.clustering: Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points are drawn from each Gaussian i=1..k with probability w(i); mu(i) and sigma(i) are the respective mean and covariance for each Gaussian distribution i=1..k.
GaussianMixtureModel(double[], MultivariateGaussian[]) - Constructor for class org.apache.spark.mllib.clustering.GaussianMixtureModel
gaussians() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
GBTClassificationModel - Class in org.apache.spark.ml.classification: :: Experimental :: Gradient-Boosted Trees (GBTs) model for classification.
GBTClassificationModel(String, DecisionTreeRegressionModel[], double[]) - Constructor for class org.apache.spark.ml.classification.GBTClassificationModel: Construct a GBTClassificationModel
GBTClassifier - Class in org.apache.spark.ml.classification: :: Experimental :: Gradient-Boosted Trees (GBTs) learning algorithm for classification.
GBTClassifier(String) - Constructor for class org.apache.spark.ml.classification.GBTClassifier
GBTClassifier() - Constructor for class org.apache.spark.ml.classification.GBTClassifier
GBTRegressionModel - Class in org.apache.spark.ml.regression: :: Experimental :: Gradient-Boosted Trees (GBTs) model for regression.
GBTRegressionModel(String, DecisionTreeRegressionModel[], double[]) - Constructor for class org.apache.spark.ml.regression.GBTRegressionModel: Construct a GBTRegressionModel
GBTRegressor - Class in org.apache.spark.ml.regression: :: Experimental :: Gradient-Boosted Trees (GBTs) learning algorithm for regression.
GBTRegressor(String) - Constructor for class org.apache.spark.ml.regression.GBTRegressor
GBTRegressor() - Constructor for class org.apache.spark.ml.regression.GBTRegressor
GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
GeneralizedLinearModel - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm.
GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
generateAssociationRules(double) - Method in class org.apache.spark.mllib.fpm.FPGrowthModel: Generates association rules for the Items in freqItemsets.
generatedRDDs() - Method in class org.apache.spark.streaming.dstream.DStream
generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator: Generate an RDD containing test data for KMeans.
generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator: For compatibility, data generated without specifying the mean and variance has zero mean and a variance of 1.0/3.0, since the original output range is [-1, 1] with a uniform distribution, and the variance of a uniform distribution on [a, b] is (b - a)^2 / 12, which is 1.0/3.0 here.
generateLinearInput(double, double[], double[], double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
generateLinearInput(double, double[], double[], double[], int, int, double, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator: Return a Java List of synthetic data randomly generated according to a multicollinear model.
generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator: Generate an RDD containing sample data for Linear Regression models, including Ridge, Lasso, and unregularized variants.
generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator: Generate an RDD containing test data for LogisticRegression.
generateRandomEdges(int, int, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
geq(Object) - Method in class org.apache.spark.sql.Column: Greater than or equal to an expression.
get() - Method in interface org.apache.spark.FutureAction: Blocks and returns the result of this job.
get(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap: Optionally returns the value associated with a param.
get(Param<T>) - Method in interface org.apache.spark.ml.param.Params
get(String) - Method in class org.apache.spark.SparkConf: Get a parameter; throws a NoSuchElementException if it's not set
get(String, String) - Method in class org.apache.spark.SparkConf: Get a parameter, falling back to a default if not set
get() - Static method in class org.apache.spark.SparkEnv: Returns the SparkEnv.
get(String) - Static method in class org.apache.spark.SparkFiles: Get the absolute path of a file added through SparkContext.addFile().
get(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i.
get() - Method in class org.apache.spark.streaming.State: Get the state if it exists; otherwise throws java.util.NoSuchElementException.
get() - Static method in class org.apache.spark.TaskContext: Return the currently active TaskContext.
get_json_object(Column, String) - Static method in class org.apache.spark.sql.functions: Extracts a JSON object from a JSON string based on the specified JSON path, and returns the JSON string of the extracted object.
getActive() - Static method in class org.apache.spark.streaming.StreamingContext: :: Experimental ::
getActiveJobIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker: Returns an array containing the ids of all active jobs.
getActiveJobIds() - Method in class org.apache.spark.SparkStatusTracker: Returns an array containing the ids of all active jobs.
getActiveOrCreate(Function0<StreamingContext>) - Static method in class org.apache.spark.streaming.StreamingContext: :: Experimental ::
getActiveOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext: :: Experimental ::
getActiveStageIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker: Returns an array containing the ids of all active stages.
getActiveStageIds() - Method in class org.apache.spark.SparkStatusTracker: Returns an array containing the ids of all active stages.
getAkkaConf() - Method in class org.apache.spark.SparkConf: Get all Akka conf variables set on this SparkConf
getAlgo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getAll() - Method in class org.apache.spark.SparkConf: Get all parameters as a list of pairs
getAllConfs() - Method in class org.apache.spark.sql.SQLContext: Return all the configuration properties that have been set (i.e. not the default).
getAllPools() - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return pools for fair scheduler
getAlpha() - Method in class org.apache.spark.mllib.clustering.LDA: Alias for getDocConcentration
getAnyValAs(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i.
getAppId() - Method in interface org.apache.spark.launcher.SparkAppHandle: Returns the application ID, or null if not yet known.
getAppId() - Method in class org.apache.spark.SparkConf: Returns the Spark application id, valid in the Driver after TaskScheduler registration and from the start in the Executor.
getAs(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i.
getAs(String) - Method in interface org.apache.spark.sql.Row: Returns the value of a given fieldName.
getAsymmetricAlpha() - Method in class org.apache.spark.mllib.clustering.LDA: Alias for getAsymmetricDocConcentration
getAsymmetricDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA: Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
getAttr(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Gets an attribute by its name.
getAttr(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Gets an attribute by its index.
getAvroSchema() - Method in class org.apache.spark.SparkConf: Gets all the Avro schemas in the configuration used in the generic Avro record serializer
getBeta() - Method in class org.apache.spark.mllib.clustering.LDA: Alias for getTopicConcentration
getBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus: Return the given block stored in this block manager in O(1) time.
getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf: Get a parameter as a boolean, falling back to a default if not set
getBoolean(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive boolean.
getBoolean(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Boolean.
getBooleanArray(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Boolean array.
getByte(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive byte.
getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD: One of the helper methods for accessing the local map, a property of the SparkEnv of the local process.
getCaseSensitive() - Method in class org.apache.spark.ml.feature.StopWordsRemover
getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Get the custom datatype mapping for the given jdbc meta information.
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
getCategoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
getCheckpointDir() - Method in class org.apache.spark.SparkContext
getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike: Gets the name of the file to which this RDD was checkpointed
getCheckpointFile() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
getCheckpointFile() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
getCheckpointFile() - Method in class org.apache.spark.rdd.RDD: Gets the name of the directory to which this RDD was checkpointed.
getCheckpointFiles() - Method in class org.apache.spark.graphx.Graph: Gets the names of the files to which this Graph was checkpointed.
getCheckpointFiles() - Method in class org.apache.spark.graphx.impl.GraphImpl
getCheckpointInterval() - Method in class org.apache.spark.mllib.clustering.LDA: Period (in iterations) between checkpoints.
getCheckpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getConf() - Method in class org.apache.spark.api.java.JavaSparkContext: Return a copy of this JavaSparkContext's configuration.
getConf() - Method in class org.apache.spark.rdd.HadoopRDD
getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
getConf() - Method in class org.apache.spark.SparkContext: Return a copy of this SparkContext's configuration.
getConf(String) - Method in class org.apache.spark.sql.SQLContext: Return the value of Spark SQL configuration property for the given key.
getConf(String, String) - Method in class org.apache.spark.sql.SQLContext: Return the value of Spark SQL configuration property for the given key.
getConnection() - Method in interface org.apache.spark.rdd.JdbcRDD.ConnectionFactory
getConvergenceTol() - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Return the largest change in log-likelihood at which convergence is considered to have occurred.
getDate(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of date type as java.sql.Date.
getDecimal(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of decimal type as java.math.BigDecimal.
getDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params: Gets the default value of a parameter.
getDegree() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
getDependencies() - Method in class org.apache.spark.rdd.RDD: Implemented by subclasses to return how this RDD depends on parent RDDs.
getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
getDeprecatedConfig(String, SparkConf) - Static method in class org.apache.spark.SparkConf: Looks for available deprecated keys for the given config option, and returns the first value available.
getDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA: Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
getDouble(String, double) - Method in class org.apache.spark.SparkConf: Get a parameter as a double, falling back to a default if not set
getDouble(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive double.
getDouble(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Double.
getDoubleArray(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Double array.
getEpsilon() - Method in class org.apache.spark.mllib.clustering.KMeans: The distance threshold within which we consider centers to have converged.
getExecutorEnv() - Method in class org.apache.spark.SparkConf: Get all executor environment variables set on this SparkConf
getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext: Return a map from the slave to the max memory available for caching and the remaining memory available for caching.
getExecutorStorageStatus() - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return information about blocks stored in all of the slaves
getField(String) - Method in class org.apache.spark.sql.Column: An expression that gets a field by name in a StructType.
getFinalValue() - Method in class org.apache.spark.partial.PartialResult: Blocking method to wait for and return the final value.
getFloat(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive float.
getFormula() - Method in class org.apache.spark.ml.feature.RFormula
getGaps() - Method in class org.apache.spark.ml.feature.RegexTokenizer
getImpurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getIndices() - Method in class org.apache.spark.ml.feature.VectorSlicer
getInitializationMode() - Method in class org.apache.spark.mllib.clustering.KMeans: The initialization algorithm.
getInitializationSteps() - Method in class org.apache.spark.mllib.clustering.KMeans: Number of steps for the k-means|| initialization mode
getInitialModel() - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Return the user supplied initial GMM, if supplied
getInitialPositionInStream(int) - Method in class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
getInputFormat(JobConf) - Method in class org.apache.spark.rdd.HadoopRDD
getInt(String, int) - Method in class org.apache.spark.SparkConf: Get a parameter as an integer, falling back to a default if not set
getInt(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive int.
getInverse() - Method in class org.apache.spark.ml.feature.DCT
getItem(Object) - Method in class org.apache.spark.sql.Column: An expression that gets an item at position ordinal out of an array, or gets a value by key key in a MapType.
getJavaMap(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of map type as a java.util.Map.
getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Retrieve the jdbc / sql type for a given datatype.
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
getJobConf() - Method in class org.apache.spark.rdd.HadoopRDD
getJobIdsForGroup(String) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker: Return a list of all known jobs in a particular job group.
getJobIdsForGroup(String) - Method in class org.apache.spark.SparkStatusTracker: Return a list of all known jobs in a particular job group.
getJobInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker: Returns job information, or null if the job info could not be found or was garbage collected.
getJobInfo(int) - Method in class org.apache.spark.SparkStatusTracker: Returns job information, or None if the job info could not be found or was garbage collected.
getK() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Gets the desired number of leaf clusters.
getK() - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Return the number of Gaussians in the mixture model
getK() - Method in class org.apache.spark.mllib.clustering.KMeans: Number of clusters to create (k).
getK() - Method in class org.apache.spark.mllib.clustering.LDA: Number of topics to infer.
getKappa() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Learning rate: exponential decay rate
getLabels() - Method in class org.apache.spark.ml.feature.IndexToString
getLambda() - Method in class org.apache.spark.mllib.classification.NaiveBayes
getLDAModel(double[]) - Method in interface org.apache.spark.mllib.clustering.LDAOptimizer
getLearningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
getLeastGroupHash(String) - Method in class org.apache.spark.rdd.PartitionCoalescer: Sorts and gets the least element of the list associated with key in groupHash. The returned PartitionGroup is the least loaded of all groups that represent the machine "key".
getList(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of array type as List.
getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Get a local property set in this thread, or null if it is missing.
getLocalProperty(String) - Method in class org.apache.spark.SparkContext: Get a local property set in this thread, or null if it is missing.
getLong(String, long) - Method in class org.apache.spark.SparkConf: Get a parameter as a long, falling back to a default if not set
getLong(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive long.
getLong(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Long.
getLongArray(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Long array.
getLoss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
getLossType() - Method in class org.apache.spark.ml.classification.GBTClassifier
getLossType() - Method in class org.apache.spark.ml.regression.GBTRegressor
getMap(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of map type as a Scala Map.
getMap() - Method in class org.apache.spark.sql.types.MetadataBuilder: Returns the immutable version of this map.
getMaxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getMaxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getMaxIterations() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Gets the max number of k-means iterations to split clusters.
getMaxIterations() - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Return the maximum number of iterations to run
getMaxIterations() - Method in class org.apache.spark.mllib.clustering.KMeans: Maximum number of iterations to run.
getMaxIterations() - Method in class org.apache.spark.mllib.clustering.LDA: Maximum number of iterations for learning.
getMaxLocalProjDBSize() - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Gets the maximum number of items allowed in a projected database before local processing.
getMaxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getMaxPatternLength() - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Gets the maximal pattern length (i.e.
getMessage() - Method in exception org.apache.spark.sql.AnalysisException
getMetadata(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Metadata.
getMetadataArray(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a Metadata array.
getMetricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
getMetricName() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
getMetricName() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
getMetricsSources(String) - Method in class org.apache.spark.TaskContext: :: DeveloperApi :: Returns all metrics sources with the given name which are associated with the instance which runs the task.
getMinDivisibleClusterSize() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Gets the minimum number of points (if >= 1.0) or the minimum proportion of points (if < 1.0) of a divisible cluster.
getMiniBatchFraction() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Mini-batch fraction, which sets the fraction of documents sampled and used in each iteration
getMinInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getMinInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getMinSupport() - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Get the minimal support (i.e.
getMinTokenLength() - Method in class org.apache.spark.ml.feature.RegexTokenizer
getModel() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
getModel() - Method in class org.apache.spark.ml.clustering.LDAModel: Returns underlying spark.mllib model, which may be local or distributed
getModel() - Method in class org.apache.spark.ml.clustering.LocalLDAModel
getModelType() - Method in class org.apache.spark.mllib.classification.NaiveBayes
getN() - Method in class org.apache.spark.ml.feature.NGram
getNames() - Method in class org.apache.spark.ml.feature.VectorSlicer
getNode(int, Node) - Static method in class org.apache.spark.mllib.tree.model.Node: Traces down from a root node to get the node with the given node index.
getNumClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getNumFeatures() - Method in class org.apache.spark.ml.feature.HashingTF
getNumFeatures() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: The dimension of training features.
getNumIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
getNumPartitions() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return the number of partitions in this RDD.
getNumPartitions() - Method in class org.apache.spark.rdd.RDD: Returns the number of partitions of this RDD.
getNumValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute: Get the number of values, either from numValues or from values.
getOldDataset(DataFrame, String) - Static method in class org.apache.spark.ml.clustering.LDA: Get dataset for spark.mllib LDA
getOptimizeDocConcentration() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Optimize docConcentration, indicates whether docConcentration (Dirichlet parameter for document-topic distribution) will be optimized during training.
getOptimizer() - Method in class org.apache.spark.mllib.clustering.LDA: :: DeveloperApi ::
getOption(String) - Method in class org.apache.spark.SparkConf: Get a parameter as an Option
getOption() - Method in class org.apache.spark.streaming.State: Get the state as an Option.
getOrCreate(SparkConf) - Static method in class org.apache.spark.SparkContext: This function may be used to get or instantiate a SparkContext and register it as a singleton object.
getOrCreate() - Static method in class org.apache.spark.SparkContext: This function may be used to get or instantiate a SparkContext and register it as a singleton object.
getOrCreate(SparkContext) - Static method in class org.apache.spark.sql.SQLContext: Get the singleton SQLContext if it exists or create a new one using the given SparkContext.
getOrCreate(String, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Deprecated.
As of 1.4.0, replaced by getOrCreate without JavaStreamingContextFactory.
getOrCreate(String, Configuration, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Deprecated.
As of 1.4.0, replaced by getOrCreate without JavaStreamingContextFactory.
getOrCreate(String, Configuration, JavaStreamingContextFactory, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Deprecated.
As of 1.4.0, replaced by getOrCreate without JavaStreamingContextFactory.
getOrCreate(String, Function0<JavaStreamingContext>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<JavaStreamingContext>, Configuration) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<JavaStreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext: Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
getOrDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params: Gets the value of a param in the embedded param map or its default value.
getOrElse(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap: Returns the value associated with a param or a default value.
getP() - Method in class org.apache.spark.ml.feature.Normalizer
getParam(String) - Method in interface org.apache.spark.ml.param.Params
getParents(int) - Method in class org.apache.spark.NarrowDependency: Get the parent partitions for a child partition.
getParents(int) - Method in class org.apache.spark.OneToOneDependency
getParents(int) - Method in class org.apache.spark.RangeDependency
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
getPartition(long, long, int) - Method in interface org.apache.spark.graphx.PartitionStrategy: Returns the partition number for a given edge.
getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
getPartition(Object) - Method in class org.apache.spark.HashPartitioner
getPartition(Object) - Method in class org.apache.spark.Partitioner
getPartition(Object) - Method in class org.apache.spark.RangePartitioner
getPartitionId() - Static method in class org.apache.spark.TaskContext: Returns the partition id of currently active TaskContext.
getPartitions() - Method in class org.apache.spark.api.r.BaseRRDD
getPartitions() - Method in class org.apache.spark.graphx.EdgeRDD
getPartitions() - Method in class org.apache.spark.graphx.VertexRDD
getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
getPartitions() - Method in class org.apache.spark.rdd.PartitionCoalescer
getPartitions() - Method in class org.apache.spark.rdd.PartitionPruningRDD
getPartitions() - Method in class org.apache.spark.rdd.RDD: Implemented by subclasses to return the set of partitions in this RDD.
getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
getPath() - Method in class org.apache.spark.input.PortableDataStream
getPattern() - Method in class org.apache.spark.ml.feature.RegexTokenizer
getPersistentRDDs() - Method in class org.apache.spark.SparkContext: Returns an immutable map of RDDs that have marked themselves as persistent via a cache() call.
getPoolForName(String) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return the pool associated with the given name, if one exists
getPreferredLocations(Partition) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD: Optionally overridden by subclasses to specify placement preferences.
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.ShuffledRDD
getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
getQuantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getRDDStorageInfo() - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Return information about what RDDs are cached, whether they are in memory or on disk, how much space they take, etc.
getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream: Gets the receiver object that will be sent to the worker nodes to receive data.
getRootDirectory() - Static method in class org.apache.spark.SparkFiles: Get the root directory that contains files added through SparkContext.addFile().
getRuns() - Method in class org.apache.spark.mllib.clustering.KMeans: :: Experimental :: Number of runs of the algorithm to execute in parallel.
getScalingVec() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
getSchedulingMode() - Method in class org.apache.spark.SparkContext: Return current scheduling mode
getSchema(Class<?>) - Method in class org.apache.spark.sql.SQLContext
getSeed() - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Gets the random seed.
getSeed() - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Return the random seed
getSeed() - Method in class org.apache.spark.mllib.clustering.KMeans: The random seed for cluster initialization.
getSeed() - Method in class org.apache.spark.mllib.clustering.LDA: Random seed
getSeq(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of array type as a Scala Seq.
getSerializer(Serializer) - Static method in class org.apache.spark.serializer.Serializer
getSerializer(Option<Serializer>) - Static method in class org.apache.spark.serializer.Serializer
getShort(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a primitive short.
getSizeAsBytes(String) - Method in class org.apache.spark.SparkConf: Get a size parameter as bytes; throws a NoSuchElementException if it's not set.
getSizeAsBytes(String, String) - Method in class org.apache.spark.SparkConf: Get a size parameter as bytes, falling back to a default if not set.
getSizeAsBytes(String, long) - Method in class org.apache.spark.SparkConf: Get a size parameter as bytes, falling back to a default if not set.
getSizeAsGb(String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Gibibytes; throws a NoSuchElementException if it's not set.
getSizeAsGb(String, String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Gibibytes, falling back to a default if not set.
getSizeAsKb(String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Kibibytes; throws a NoSuchElementException if it's not set.
getSizeAsKb(String, String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Kibibytes, falling back to a default if not set.
getSizeAsMb(String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Mebibytes; throws a NoSuchElementException if it's not set.
getSizeAsMb(String, String) - Method in class org.apache.spark.SparkConf: Get a size parameter as Mebibytes, falling back to a default if not set.
getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext: Get Spark's home location from either a value set through the constructor, or the spark.home Java property, or the SPARK_HOME environment variable (in that order of preference).
getSplits() - Method in class org.apache.spark.ml.feature.Bucketizer
getSQLDialect() - Method in class org.apache.spark.sql.hive.HiveContext
getSQLDialect() - Method in class org.apache.spark.sql.SQLContext
getStageInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker: Returns stage information, or null if the stage info could not be found or was garbage collected.
getStageInfo(int) - Method in class org.apache.spark.SparkStatusTracker: Returns stage information, or None if the stage info could not be found or was garbage collected.
getStages() - Method in class org.apache.spark.ml.Pipeline
getState() - Method in interface org.apache.spark.launcher.SparkAppHandle: Returns the current application state.
getState() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: :: DeveloperApi ::
getState() - Method in class org.apache.spark.streaming.StreamingContext: :: DeveloperApi ::
getStatement() - Method in class org.apache.spark.ml.feature.SQLTransformer
getStopWords() - Method in class org.apache.spark.ml.feature.StopWordsRemover
getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike: Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
getStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
getStorageLevel() - Method in class org.apache.spark.rdd.RDD: Get the RDD's current storage level, or StorageLevel.NONE if none is set.
getString(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i as a String object.
getString(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a String.
getStringArray(String) - Method in class org.apache.spark.sql.types.Metadata: Gets a String array.
getStruct(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of struct type as a Row object.
getSubsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getTableExistsQuery(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Get the SQL query that should be used to find if the given table exists.
getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
getTableExistsQuery(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
getTau0() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: A (positive) learning parameter that downweights early iterations.
getThreadLocal() - Static method in class org.apache.spark.SparkEnv: Returns the ThreadLocal SparkEnv.
getThreshold() - Method in class org.apache.spark.ml.classification.LogisticRegression
getThreshold() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
getThreshold() - Method in class org.apache.spark.ml.feature.Binarizer
getThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel: Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
getThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel: Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
getThresholds() - Method in class org.apache.spark.ml.classification.LogisticRegression
getThresholds() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
getTimeAsMs(String) - Method in class org.apache.spark.SparkConf: Get a time parameter as milliseconds; throws a NoSuchElementException if it's not set.
getTimeAsMs(String, String) - Method in class org.apache.spark.SparkConf: Get a time parameter as milliseconds, falling back to a default if not set.
getTimeAsSeconds(String) - Method in class org.apache.spark.SparkConf: Get a time parameter as seconds; throws a NoSuchElementException if it's not set.
getTimeAsSeconds(String, String) - Method in class org.apache.spark.SparkConf: Get a time parameter as seconds, falling back to a default if not set.
getTimestamp(int) - Method in interface org.apache.spark.sql.Row: Returns the value at position i of timestamp type as java.sql.Timestamp.
gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo: The time when the task started remotely getting the result.
getToLowercase() - Method in class org.apache.spark.ml.feature.RegexTokenizer
getTopicConcentration() - Method in class org.apache.spark.mllib.clustering.LDA: Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics' distributions over terms.
getTreeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
getUseNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
getValidationTol() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
getValue() - Method in class org.apache.spark.broadcast.Broadcast: Actually get the broadcasted value.
getValue(int) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Gets a value given its index.
getValuesMap(Seq<String>) - Method in interface org.apache.spark.sql.Row: Returns a Map(name -> value) for the requested fieldNames. For primitive types, if the value is null it returns the 'zero value' specific to the primitive, ie.
getVectors() - Method in class org.apache.spark.ml.feature.Word2VecModel: Returns a dataframe with two fields, "word" and "vector", with "word" being a String and "vector" the DenseVector that it is mapped to.
getVectors() - Method in class org.apache.spark.mllib.feature.Word2VecModel
Gini - Class in org.apache.spark.mllib.tree.impurity: :: Experimental :: Class for calculating the Gini impurity during binary classification.
Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
globalTopicTotals() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
globalTopicTotals() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer: Aggregate distributions over topics from all term vertices.
glom() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in class org.apache.spark.rdd.RDD: Return an RDD created by coalescing all elements within each partition into an array.
glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
glom() - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying glom() to each RDD of this DStream.
gradient() - Method in class org.apache.spark.ml.classification.LogisticAggregator
gradient() - Method in class org.apache.spark.ml.regression.AFTAggregator
gradient() - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
Gradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to compute the gradient for a loss function, given a single data point.
Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError: Method to calculate the gradients for the gradient boosting calculation for least absolute error calculation.
gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss: Method to calculate the loss gradients for the gradient boosting calculation for binary classification. The gradient with respect to F(x) is: - 4 y / (1 + exp(2 y F(x)))
gradient(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss: Method to calculate the gradients for the gradient boosting calculation.
gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError: Method to calculate the gradients for the gradient boosting calculation for least squares error calculation.
GradientBoostedTrees - Class in org.apache.spark.mllib.tree: A class that implements Stochastic Gradient Boosting for regression and binary classification.
GradientBoostedTrees(BoostingStrategy) - Constructor for class org.apache.spark.mllib.tree.GradientBoostedTrees
GradientBoostedTreesModel - Class in org.apache.spark.mllib.tree.model: Represents a gradient boosted trees model.
GradientBoostedTreesModel(Enumeration.Value, DecisionTreeModel[], double[]) - Constructor for class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
GradientDescent - Class in org.apache.spark.mllib.optimization: Class used to solve an optimization problem using Gradient Descent.
Graph<VD,ED> - Class in org.apache.spark.graphx: The Graph abstractly represents a graph with arbitrary objects associated with vertices and edges.
Graph(ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.Graph
graph() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
graph() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer: The following fields will only be initialized through the initialize() method
graph() - Method in class org.apache.spark.streaming.dstream.DStream
graph() - Method in class org.apache.spark.streaming.StreamingContext
GraphGenerators - Class in org.apache.spark.graphx.util: A collection of graph generating functions.
GraphGenerators() - Constructor for class org.apache.spark.graphx.util.GraphGenerators
GraphImpl<VD,ED> - Class in org.apache.spark.graphx.impl: An implementation of Graph to support computation on graphs.
GraphImpl(VertexRDD<VD>, ReplicatedVertexView<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.impl.GraphImpl
GraphImpl(ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.impl.GraphImpl: Default constructor is provided to support serialization
GraphKryoRegistrator - Class in org.apache.spark.graphx: Registers GraphX classes with Kryo for improved performance.
GraphKryoRegistrator() - Constructor for class org.apache.spark.graphx.GraphKryoRegistrator
GraphLoader - Class in org.apache.spark.graphx: Provides utilities for loading Graphs from files.
GraphLoader() - Constructor for class org.apache.spark.graphx.GraphLoader
GraphOps<VD,ED> - Class in org.apache.spark.graphx: Contains additional functionality for Graph.
GraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.GraphOps
graphToGraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph: Implicitly extracts the GraphOps member from a graph.
GraphXUtils - Class in org.apache.spark.graphx
GraphXUtils() - Constructor for class org.apache.spark.graphx.GraphXUtils
greater(Duration) - Method in class org.apache.spark.streaming.Duration
greater(Time) - Method in class org.apache.spark.streaming.Time
greaterEq(Duration) - Method in class org.apache.spark.streaming.Duration
greaterEq(Time) - Method in class org.apache.spark.streaming.Time
GreaterThan - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a value greater than value.
GreaterThan(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThan
GreaterThanOrEqual - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a value greater than or equal to value.
GreaterThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThanOrEqual
greatest(Column...) - Static method in class org.apache.spark.sql.functions: Returns the greatest value of the list of values, skipping null values.
greatest(String, String...) - Static method in class org.apache.spark.sql.functions: Returns the greatest value of the list of column names, skipping null values.
greatest(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Returns the greatest value of the list of values, skipping null values.
greatest(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Returns the greatest value of the list of column names, skipping null values.
gridGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators: Create a rows-by-cols grid graph with each vertex connected to its row+1 and col+1 neighbors.
groupArr() - Method in class org.apache.spark.rdd.PartitionCoalescer
groupBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD of grouped elements.
groupBy(Function<T, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD of grouped elements.
groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped items.
groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped elements.
groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD: Return an RDD of grouped items.
groupBy(Column...) - Method in class org.apache.spark.sql.DataFrame: Groups the DataFrame using the specified columns, so we can run aggregation on them.
groupBy(String, String...) - Method in class org.apache.spark.sql.DataFrame: Groups the DataFrame using the specified columns, so we can run aggregation on them.
groupBy(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame: Groups the DataFrame using the specified columns, so we can run aggregation on them.
groupBy(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Groups the DataFrame using the specified columns, so we can run aggregation on them.
groupBy(Column...) - Method in class org.apache.spark.sql.Dataset: Returns a GroupedDataset where the data is grouped by the given Column expressions.
groupBy(Function1<T, K>, Encoder<K>) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Returns a GroupedDataset where the data is grouped by the given key func.
groupBy(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Returns a GroupedDataset where the data is grouped by the given Column expressions.
groupBy(MapFunction<T, K>, Encoder<K>) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Returns a GroupedDataset where the data is grouped by the given key func.
groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Group the values for each key in the RDD into a single sequence.
groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions: Group the values for each key in the RDD into a single sequence.
groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey on each RDD of this DStream.
groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey to each RDD.
groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey to each RDD.
groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey on each RDD.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window.
groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying groupByKey over a sliding window on this DStream.
groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Create a new DStream by applying groupByKey over a sliding window on this DStream.
GroupedData - Class in org.apache.spark.sql: :: Experimental :: A set of methods for aggregations on a DataFrame, created by DataFrame.groupBy.
GroupedData(DataFrame, Seq<Expression>, GroupedData.GroupType) - Constructor for class org.apache.spark.sql.GroupedData
GroupedDataset<K,V> - Class in org.apache.spark.sql: :: Experimental :: A Dataset that has been logically grouped by a user-specified grouping key.
groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.Graph: Merges multiple edges between two vertices into a single edge.
groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.GraphImpl
groupHash() - Method in class org.apache.spark.rdd.PartitionCoalescer
groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD: Alias for cogroup.
groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for cogroup.
gt(double) - Static method in class org.apache.spark.ml.param.ParamValidators: Check if value > lowerBound
gt(Object) - Method in class org.apache.spark.sql.Column: Greater than.
gtEq(double) - Static method in class org.apache.spark.ml.param.ParamValidators: Check if value >= lowerBound

H

hadoopConfiguration() - Method in class org.apache.spark.api.java.JavaSparkContext: Returns the Hadoop configuration used for the Hadoop code (e.g.
hadoopConfiguration() - Method in class org.apache.spark.SparkContext: A default Hadoop Configuration for the Hadoop code (e.g.
hadoopFile(String, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat.
hadoopFile(String, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat
hadoopFile(String, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop file with an arbitrary InputFormat
hadoopFile(String, int, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
hadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Smarter version of hadoopFile() that uses class tags to figure out the classes of keys, values and the InputFormat so that users don't need to pass them directly.
HadoopFsRelation - Class in org.apache.spark.sql.sources: :: Experimental :: A BaseRelation that provides much of the common code required for relations that store their data to an HDFS compatible filesystem.
HadoopFsRelation() - Constructor for class org.apache.spark.sql.sources.HadoopFsRelation
HadoopFsRelation(Map<String, String>) - Constructor for class org.apache.spark.sql.sources.HadoopFsRelation
HadoopFsRelation.FakeFileStatus - Class in org.apache.spark.sql.sources
HadoopFsRelation.FakeFileStatus(String, long, boolean, short, long, long, long) - Constructor for class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
HadoopFsRelation.FakeFileStatus$ - Class in org.apache.spark.sql.sources
HadoopFsRelation.FakeFileStatus$() - Constructor for class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus$
HadoopFsRelationProvider - Interface in org.apache.spark.sql.sources: :: Experimental :: Implemented by objects that produce relations for a specific kind of data source with a given schema and partitioned columns.
hadoopJobMetadata() - Method in class org.apache.spark.SparkEnv
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf giving its InputFormat and any other necessary info (e.g.
hadoopRDD(JobConf, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf giving its InputFormat and any other necessary info (e.g.
HadoopRDD<K,V> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the older MapReduce API (org.apache.hadoop.mapred).
HadoopRDD(SparkContext, Broadcast<org.apache.spark.util.SerializableConfiguration>, Option<Function1<JobConf, BoxedUnit>>, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
HadoopRDD(SparkContext, JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Constructor for class org.apache.spark.rdd.HadoopRDD
hadoopRDD(JobConf, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and other necessary info (e.g.
hammingLoss() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns Hamming-loss
handle(Signal) - Method in class org.apache.spark.util.SignalLoggerHandler
hasAttr(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Test whether this attribute group contains a specific attribute.
hasDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params: Tests whether the input param has a default value set.
hashCode() - Method in class org.apache.spark.graphx.EdgeDirection
hashCode() - Method in class org.apache.spark.HashPartitioner
hashCode() - Method in class org.apache.spark.ml.attribute.AttributeGroup
hashCode() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
hashCode() - Method in class org.apache.spark.ml.attribute.NominalAttribute
hashCode() - Method in class org.apache.spark.ml.attribute.NumericAttribute
hashCode() - Method in class org.apache.spark.ml.param.Param
hashCode() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
hashCode() - Method in class org.apache.spark.mllib.linalg.DenseVector
hashCode() - Method in class org.apache.spark.mllib.linalg.SparseVector
hashCode() - Method in interface org.apache.spark.mllib.linalg.Vector: Returns a hash code value for the vector.
hashCode() - Method in class org.apache.spark.mllib.linalg.VectorUDT
hashCode() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
hashCode() - Method in class org.apache.spark.mllib.tree.model.Predict
hashCode() - Method in interface org.apache.spark.Partition
hashCode() - Method in class org.apache.spark.RangePartitioner
hashCode() - Method in class org.apache.spark.scheduler.AccumulableInfo
hashCode() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
hashCode() - Method in class org.apache.spark.scheduler.InputFormatInfo
hashCode() - Method in class org.apache.spark.scheduler.SplitInfo
hashCode() - Method in class org.apache.spark.sql.Column
hashCode() - Method in interface org.apache.spark.sql.Row
hashCode() - Method in class org.apache.spark.sql.types.Decimal
hashCode() - Method in class org.apache.spark.sql.types.Metadata
hashCode() - Method in class org.apache.spark.storage.BlockId
hashCode() - Method in class org.apache.spark.storage.BlockManagerId
hashCode() - Method in class org.apache.spark.storage.StorageLevel
hashCode() - Method in class org.apache.spark.streaming.kafka.Broker
hashCode() - Method in class org.apache.spark.streaming.kafka.OffsetRange
HashingTF - Class in org.apache.spark.ml.feature: :: Experimental :: Maps a sequence of terms to their term frequencies using the hashing trick.
HashingTF(String) - Constructor for class org.apache.spark.ml.feature.HashingTF
HashingTF() - Constructor for class org.apache.spark.ml.feature.HashingTF
HashingTF - Class in org.apache.spark.mllib.feature: Maps a sequence of terms to their term frequencies using the hashing trick.
HashingTF(int) - Constructor for class org.apache.spark.mllib.feature.HashingTF
HashingTF() - Constructor for class org.apache.spark.mllib.feature.HashingTF
HashPartitioner - Class in org.apache.spark: A Partitioner that implements hash-based partitioning using Java's Object.hashCode.
HashPartitioner(int) - Constructor for class org.apache.spark.HashPartitioner
hasNext() - Method in class org.apache.spark.InterruptibleIterator
hasNext() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
HasOffsetRanges - Interface in org.apache.spark.streaming.kafka: Represents any object that has a collection of OffsetRanges.
hasParam(String) - Method in interface org.apache.spark.ml.param.Params
hasParent() - Method in class org.apache.spark.ml.Model: Indicates whether this Model has a corresponding parent.
hasSummary() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: Indicates whether a training summary exists for this model instance.
hasSummary() - Method in class org.apache.spark.ml.regression.LinearRegressionModel: Indicates whether a training summary exists for this model instance.
hasValue(String) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Tests whether this attribute contains a specific value.
head(int) - Method in class org.apache.spark.sql.DataFrame: Returns the first n rows.
head() - Method in class org.apache.spark.sql.DataFrame: Returns the first row.
hex(Column) - Static method in class org.apache.spark.sql.functions: Computes hex value of the given column.
high() - Method in class org.apache.spark.partial.BoundedDouble
HingeGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a Hinge loss function, as used in SVM binary classification.
HingeGradient() - Constructor for class org.apache.spark.mllib.optimization.HingeGradient
histogram(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[]) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute a histogram using the provided buckets.
histogram(Double[], boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
histogram(int) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute a histogram of the data using bucketCount number of buckets evenly spaced between the minimum and maximum of the RDD.
histogram(double[], boolean) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute a histogram using the provided buckets.
HIVE_EXECUTION_VERSION() - Static method in class org.apache.spark.sql.hive.HiveContext
HIVE_METASTORE_BARRIER_PREFIXES() - Static method in class org.apache.spark.sql.hive.HiveContext
HIVE_METASTORE_JARS() - Static method in class org.apache.spark.sql.hive.HiveContext
HIVE_METASTORE_SHARED_PREFIXES() - Static method in class org.apache.spark.sql.hive.HiveContext
HIVE_METASTORE_VERSION() - Static method in class org.apache.spark.sql.hive.HiveContext
HIVE_THRIFT_SERVER_ASYNC() - Static method in class org.apache.spark.sql.hive.HiveContext
hiveconf() - Method in class org.apache.spark.sql.hive.HiveContext: SQLConf and HiveConf contracts:
HiveContext - Class in org.apache.spark.sql.hive: An instance of the Spark SQL execution engine that integrates with data stored in Hive.
HiveContext(SparkContext) - Constructor for class org.apache.spark.sql.hive.HiveContext
HiveContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.hive.HiveContext
HiveContext.QueryExecution - Class in org.apache.spark.sql.hive: Extends QueryExecution with hive specific features.
HiveContext.QueryExecution(LogicalPlan) - Constructor for class org.apache.spark.sql.hive.HiveContext.QueryExecution
hiveExecutionVersion() - Static method in class org.apache.spark.sql.hive.HiveContext: The version of hive used internally by Spark SQL.
hiveMetastoreBarrierPrefixes() - Method in class org.apache.spark.sql.hive.HiveContext: A comma separated list of class prefixes that should explicitly be reloaded for each version of Hive that Spark SQL is communicating with.
hiveMetastoreJars() - Method in class org.apache.spark.sql.hive.HiveContext: The location of the jars that should be used to instantiate the HiveMetastoreClient.
hiveMetastoreSharedPrefixes() - Method in class org.apache.spark.sql.hive.HiveContext: A comma separated list of class prefixes that should be loaded using the classloader that is shared between Spark SQL and a specific version of Hive.
hiveMetastoreVersion() - Method in class org.apache.spark.sql.hive.HiveContext: The version of the hive client that will be used to communicate with the metastore.
hiveThriftServerAsync() - Method in class org.apache.spark.sql.hive.HiveContext
hiveThriftServerSingleSession() - Method in class org.apache.spark.sql.hive.HiveContext
horzcat(Matrix[]) - Static method in class org.apache.spark.mllib.linalg.Matrices: Horizontally concatenate a sequence of matrices.
host() - Method in class org.apache.spark.scheduler.TaskInfo
host() - Method in class org.apache.spark.status.api.v1.TaskData
host() - Method in class org.apache.spark.storage.BlockManagerId
host() - Method in class org.apache.spark.streaming.kafka.Broker: Broker's hostname
hostLocation() - Method in class org.apache.spark.scheduler.SplitInfo
hostPort() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
hostPort() - Method in class org.apache.spark.storage.BlockManagerId
hour(Column) - Static method in class org.apache.spark.sql.functions: Extracts the hours as an integer from a given date/timestamp/string.
hours() - Static method in class org.apache.spark.scheduler.StatsReportListener
HttpBroadcastFactory - Class in org.apache.spark.broadcast: A BroadcastFactory implementation that uses an HTTP server as the broadcast mechanism.
HttpBroadcastFactory() - Constructor for class org.apache.spark.broadcast.HttpBroadcastFactory
hypot(Column, Column) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(Column, String) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(String, Column) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(String, String) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(Column, double) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(String, double) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(double, Column) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
hypot(double, String) - Static method in class org.apache.spark.sql.functions: Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.

I

i() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
id() - Method in class org.apache.spark.Accumulable
id() - Method in interface org.apache.spark.api.java.JavaRDDLike: A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.broadcast.Broadcast
id() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
id() - Method in class org.apache.spark.mllib.tree.model.Node
id() - Method in class org.apache.spark.rdd.RDD: A unique ID for this RDD (within its SparkContext).
id() - Method in class org.apache.spark.scheduler.AccumulableInfo
id() - Method in class org.apache.spark.scheduler.TaskInfo
id() - Method in class org.apache.spark.status.api.v1.AccumulableInfo
id() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
id() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
id() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
id() - Method in class org.apache.spark.storage.RDDInfo
id() - Method in class org.apache.spark.streaming.dstream.InputDStream: This is a unique identifier for the input stream.
id() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
Identifiable - Interface in org.apache.spark.ml.util: :: DeveloperApi ::
IDF - Class in org.apache.spark.ml.feature: :: Experimental :: Compute the Inverse Document Frequency (IDF) given a collection of documents.
IDF(String) - Constructor for class org.apache.spark.ml.feature.IDF
IDF() - Constructor for class org.apache.spark.ml.feature.IDF
idf() - Method in class org.apache.spark.ml.feature.IDFModel
IDF - Class in org.apache.spark.mllib.feature: Inverse document frequency (IDF).
IDF(int) - Constructor for class org.apache.spark.mllib.feature.IDF
IDF() - Constructor for class org.apache.spark.mllib.feature.IDF
idf() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator: Returns the current IDF vector.
idf() - Method in class org.apache.spark.mllib.feature.IDFModel
IDF.DocumentFrequencyAggregator - Class in org.apache.spark.mllib.feature: Document frequency aggregator.
IDF.DocumentFrequencyAggregator(int) - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
IDF.DocumentFrequencyAggregator() - Constructor for class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
IDFModel - Class in org.apache.spark.ml.feature
IDFModel - Class in org.apache.spark.mllib.feature: Represents an IDF model that can transform term frequency vectors.
implicits() - Method in class org.apache.spark.sql.SQLContext: Accessor for nested Scala object
impurity() - Method in class org.apache.spark.ml.tree.InternalNode
impurity() - Method in class org.apache.spark.ml.tree.LeafNode
impurity() - Method in class org.apache.spark.ml.tree.Node: Impurity measure at this node (for training data)
impurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Impurity - Interface in org.apache.spark.mllib.tree.impurity: :: Experimental :: Trait for calculating information gain.
impurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
impurity() - Method in class org.apache.spark.mllib.tree.model.Node
impurityStats() - Method in class org.apache.spark.ml.tree.InternalNode
impurityStats() - Method in class org.apache.spark.ml.tree.LeafNode
In() - Static method in class org.apache.spark.graphx.EdgeDirection: Edges arriving at a vertex.
in(Object...) - Method in class org.apache.spark.sql.Column: Deprecated. As of 1.5.0. Use isin. This will be removed in Spark 2.0.
in(Seq<Object>) - Method in class org.apache.spark.sql.Column: Deprecated. As of 1.5.0. Use isin. This will be removed in Spark 2.0.
In - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to one of the values in the array.
In(String, Object[]) - Constructor for class org.apache.spark.sql.sources.In
inArray(Object) - Static method in class org.apache.spark.ml.param.ParamValidators: Check for value in an allowed set of values.
inArray(List<T>) - Static method in class org.apache.spark.ml.param.ParamValidators: Check for value in an allowed set of values.
inDegrees() - Method in class org.apache.spark.graphx.GraphOps: The in-degree of each vertex in the graph.
index() - Method in class org.apache.spark.ml.attribute.Attribute: Index of the attribute.
index() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
index() - Method in class org.apache.spark.ml.attribute.NominalAttribute
index() - Method in class org.apache.spark.ml.attribute.NumericAttribute
index() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
index() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
index(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix: Return the index for the (i, j)-th element in the backing array.
index() - Method in interface org.apache.spark.Partition: Get the partition's index within its parent RDD
index() - Method in class org.apache.spark.scheduler.TaskInfo
index() - Method in class org.apache.spark.status.api.v1.TaskData
IndexedRow - Class in org.apache.spark.mllib.linalg.distributed: Represents a row of IndexedRowMatrix.
IndexedRow(long, Vector) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRow
IndexedRowMatrix - Class in org.apache.spark.mllib.linalg.distributed
IndexedRowMatrix(RDD<IndexedRow>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
IndexedRowMatrix(RDD<IndexedRow>) - Constructor for class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
indexOf(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Index of an attribute specified by name.
indexOf(String) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Index of a specific value.
indexOf(Object) - Method in class org.apache.spark.mllib.feature.HashingTF: Returns the index of the input term.
indexToLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return the level of a tree which the given node is in.
IndexToString - Class in org.apache.spark.ml.feature
IndexToString() - Constructor for class org.apache.spark.ml.feature.IndexToString
indices() - Method in class org.apache.spark.ml.feature.VectorSlicer: An array of indices to select features from a vector column.
indices() - Method in class org.apache.spark.mllib.linalg.SparseVector
infoChanged(SparkAppHandle) - Method in interface org.apache.spark.launcher.SparkAppHandle.Listener: Callback for changes in any information that is not the handle's state.
InformationGainStats - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Information gain statistics for each split. Parameters: gain (information gain value), impurity (current node impurity), leftImpurity (left node impurity), rightImpurity (right node impurity), leftPredict (left node predict), rightPredict (right node predict).
InformationGainStats(double, double, double, double, Predict, Predict) - Constructor for class org.apache.spark.mllib.tree.model.InformationGainStats
initcap(Column) - Static method in class org.apache.spark.sql.functions: Returns a new string column by converting the first letter of each word to uppercase.
initConverter(StructType) - Method in class org.apache.spark.sql.sources.OutputWriter
initialHash() - Method in class org.apache.spark.rdd.PartitionCoalescer
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in interface org.apache.spark.broadcast.BroadcastFactory
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
initialize(boolean, SparkConf, org.apache.spark.SecurityManager) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
initialize(RDD<Tuple2<Object, Vector>>, LDA) - Method in interface org.apache.spark.mllib.clustering.LDAOptimizer: Initializer for the optimizer.
initialize(MutableAggregationBuffer) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Initializes the given aggregation buffer, i.e.
initializeIfNecessary() - Method in interface org.apache.spark.Logging
initializeLogging() - Method in interface org.apache.spark.Logging
initialState(RDD<Tuple2<KeyType, StateType>>) - Method in class org.apache.spark.streaming.StateSpec: Set the RDD containing the initial states that will be used by `mapWithState`
initialState(JavaPairRDD<KeyType, StateType>) - Method in class org.apache.spark.streaming.StateSpec: Set the RDD containing the initial states that will be used by `mapWithState`
initialValue() - Method in class org.apache.spark.Accumulator
initialValue() - Method in class org.apache.spark.partial.PartialResult
initLocalProperties() - Method in class org.apache.spark.SparkContext
InnerClosureFinder - Class in org.apache.spark.util
InnerClosureFinder(Set<Class<?>>) - Constructor for class org.apache.spark.util.InnerClosureFinder
innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.EdgeRDD: Inner joins this EdgeRDD with another EdgeRDD, assuming both are partitioned using the same PartitionStrategy.
innerJoin(EdgeRDD<ED2>, Function4<Object, Object, ED, ED2, ED3>, ClassTag<ED2>, ClassTag<ED3>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
innerJoin(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD2>, ClassTag, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD: Inner joins this VertexRDD with an RDD containing vertex attribute pairs.
innerZipJoin(VertexRDD, Function3<Object, VD, U, VD2>, ClassTag, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
innerZipJoin(VertexRDD, Function3<Object, VD, U, VD2>, ClassTag, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD: Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index.
input_file_name() - Static method in class org.apache.spark.sql.functions: Creates a string column for the file name of the current Spark task.
inputBytes() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
inputBytes() - Method in class org.apache.spark.status.api.v1.StageData
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
inputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
InputDStream<T> - Class in org.apache.spark.streaming.dstream: This is the abstract base class for all input streams.
InputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.InputDStream
inputFileName() - Static method in class org.apache.spark.sql.functions: Deprecated. As of 1.6.0, replaced by input_file_name. This will be removed in Spark 2.0.
inputFiles() - Method in class org.apache.spark.sql.DataFrame: Returns a best-effort snapshot of the files that compose this DataFrame.
inputFiles() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
inputFormatCacheKey() - Method in class org.apache.spark.rdd.HadoopRDD
inputFormatClazz() - Method in class org.apache.spark.scheduler.InputFormatInfo
inputFormatClazz() - Method in class org.apache.spark.scheduler.SplitInfo
InputFormatInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Parses and holds information about inputFormat (and files) specified as a parameter.
InputFormatInfo(Configuration, Class<?>, String) - Constructor for class org.apache.spark.scheduler.InputFormatInfo
InputMetricDistributions - Class in org.apache.spark.status.api.v1
InputMetrics - Class in org.apache.spark.status.api.v1
inputMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
inputMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
inputRecords() - Method in class org.apache.spark.status.api.v1.StageData
inputSchema() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: A StructType represents data types of input arguments of this aggregate function.
inputStreamId() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
inputTypes() - Method in class org.apache.spark.sql.UserDefinedFunction
inRange(double, double, boolean, boolean) - Static method in class org.apache.spark.ml.param.ParamValidators: Check for value in range lowerBound to upperBound.
inRange(double, double) - Static method in class org.apache.spark.ml.param.ParamValidators: Version of inRange() which uses inclusive bounds by default: [lowerBound, upperBound]
insert(DataFrame, boolean) - Method in interface org.apache.spark.sql.sources.InsertableRelation
InsertableRelation - Interface in org.apache.spark.sql.sources: :: DeveloperApi :: A BaseRelation that can be used to insert data into it through the insert method.
insertInto(String, boolean) - Method in class org.apache.spark.sql.DataFrame: Deprecated. As of 1.4.0, replaced by write().mode(SaveMode.Append|SaveMode.Overwrite).saveAsTable(tableName). This will be removed in Spark 2.0.
insertInto(String) - Method in class org.apache.spark.sql.DataFrame: Deprecated. As of 1.4.0, replaced by write().mode(SaveMode.Append).saveAsTable(tableName). This will be removed in Spark 2.0.
insertInto(String) - Method in class org.apache.spark.sql.DataFrameWriter: Inserts the content of the DataFrame to the specified table.
insertIntoJDBC(String, String, boolean) - Method in class org.apache.spark.sql.DataFrame: Deprecated. As of 1.4.0, replaced by write().jdbc(). This will be removed in Spark 2.0.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Entropy: Get this impurity instance.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Gini: Get this impurity instance.
instance() - Static method in class org.apache.spark.mllib.tree.impurity.Variance: Get this impurity instance.
INSTANCE - Static variable in class org.apache.spark.serializer.DummySerializerInstance
instr(Column, String) - Static method in class org.apache.spark.sql.functions: Locate the position of the first occurrence of substr column in the given string.
INT() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable int type.
intAccumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator integer variable, which tasks can "add" values to using the add method.
intAccumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Create an Accumulator integer variable, which tasks can "add" values to using the add method.
IntArrayParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Array[Int]] for Java.
IntArrayParam(Params, String, String, Function1<int[], Object>) - Constructor for class org.apache.spark.ml.param.IntArrayParam
IntArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.IntArrayParam
IntDecimal() - Static method in class org.apache.spark.sql.types.DecimalType
IntegerType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the IntegerType object.
IntegerType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing Int values.
integral() - Method in class org.apache.spark.sql.types.ByteType
integral() - Method in class org.apache.spark.sql.types.IntegerType
integral() - Method in class org.apache.spark.sql.types.LongType
integral() - Method in class org.apache.spark.sql.types.ShortType
Interaction - Class in org.apache.spark.ml.feature: :: Experimental :: Implements the feature interaction transform.
Interaction(String) - Constructor for class org.apache.spark.ml.feature.Interaction
Interaction() - Constructor for class org.apache.spark.ml.feature.Interaction
intercept() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
intercept() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
intercept() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
intercept() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
intercept() - Method in class org.apache.spark.mllib.classification.SVMModel
intercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
intercept() - Method in class org.apache.spark.mllib.regression.LassoModel
intercept() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
intercept() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
internal() - Method in class org.apache.spark.scheduler.AccumulableInfo
internalMetricsToAccumulators() - Method in class org.apache.spark.TaskContext: Accumulators for tracking internal metrics indexed by the name.
InternalNode - Class in org.apache.spark.ml.tree: :: DeveloperApi :: Internal Decision Tree node.
interpretedOrdering() - Method in class org.apache.spark.sql.types.ArrayType
interpretedOrdering() - Method in class org.apache.spark.sql.types.StructType
InterruptibleIterator<T> - Class in org.apache.spark: :: DeveloperApi :: An iterator that wraps around an existing iterator to provide task killing functionality.
InterruptibleIterator(TaskContext, Iterator<T>) - Constructor for class org.apache.spark.InterruptibleIterator
interruptThread() - Method in class org.apache.spark.scheduler.local.KillTask
intersect(DataFrame) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame containing rows only in both this frame and another frame.
intersect(Dataset<T>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset that contains only the elements of this Dataset that are also present in other.
intersection(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the intersection of this RDD and another one.
intersection(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the intersection of this RDD and another one.
intersection(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return the intersection of this RDD and another one.
intersection(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intersection(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
intersection(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD: Return the intersection of this RDD and another one.
IntParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Int] for Java.
IntParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.IntParam
IntParam(String, String, String) - Constructor for class org.apache.spark.ml.param.IntParam
IntParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.IntParam
IntParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.IntParam
intRddToDataFrameHolder(RDD<Object>) - Method in class org.apache.spark.sql.SQLImplicits: Creates a single column DataFrame from an RDD[Int].
intToIntWritable(int) - Static method in class org.apache.spark.SparkContext
intWritableConverter() - Static method in class org.apache.spark.SparkContext
invalidateTable(String) - Method in class org.apache.spark.sql.hive.HiveContext
invalidInformationGainStats() - Static method in class org.apache.spark.mllib.tree.model.InformationGainStats: An InformationGainStats object to denote that the current split doesn't satisfy the minimum info gain or the minimum number of instances per node.
inverse() - Method in class org.apache.spark.ml.feature.DCT: Indicates whether to perform the inverse DCT (true) or forward DCT (false).
isAddIntercept() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Get if the algorithm uses addIntercept
isAkkaConf(String) - Static method in class org.apache.spark.SparkConf: Return whether the given config is an akka config (e.g.
isAllowed(Enumeration.Value, Enumeration.Value) - Static method in class org.apache.spark.scheduler.TaskLocality
isBroadcast() - Method in class org.apache.spark.storage.BlockId
isCached(String) - Method in class org.apache.spark.sql.SQLContext: Returns true if the table is currently cached in-memory.
isCached() - Method in class org.apache.spark.storage.BlockStatus
isCached() - Method in class org.apache.spark.storage.RDDInfo
isCancelled() - Method in class org.apache.spark.ComplexFutureAction
isCancelled() - Method in interface org.apache.spark.FutureAction: Returns whether the action has been cancelled.
isCancelled() - Method in class org.apache.spark.SimpleFutureAction
isCheckpointed() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return whether this RDD has been checkpointed or not
isCheckpointed() - Method in class org.apache.spark.graphx.Graph: Return whether this Graph has been checkpointed or not.
isCheckpointed() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
isCheckpointed() - Method in class org.apache.spark.graphx.impl.GraphImpl
isCheckpointed() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
isCheckpointed() - Method in class org.apache.spark.rdd.RDD: Return whether this RDD is checkpointed and materialized, either reliably or locally.
isCheckpointPresent() - Method in class org.apache.spark.streaming.StreamingContext
isCompleted() - Method in class org.apache.spark.ComplexFutureAction
isCompleted() - Method in interface org.apache.spark.FutureAction: Returns whether the action has already been completed with a value or an exception.
isCompleted() - Method in class org.apache.spark.SimpleFutureAction
isCompleted() - Method in class org.apache.spark.TaskContext: Returns true if the task has completed.
isDefined(Param<?>) - Method in interface org.apache.spark.ml.param.Params
isDir() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
isDistributed() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
isDistributed() - Method in class org.apache.spark.ml.clustering.LDAModel: Indicates whether this instance is of type DistributedLDAModel
isDistributed() - Method in class org.apache.spark.ml.clustering.LocalLDAModel
isDriver() - Method in class org.apache.spark.storage.BlockManagerId
isEmpty() - Method in interface org.apache.spark.api.java.JavaRDDLike
isEmpty() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
isEmpty() - Method in class org.apache.spark.rdd.RDD
isExecutorStartupConf(String) - Static method in class org.apache.spark.SparkConf: Return whether the given config should be passed to an executor on start-up.
isExperiment() - Method in class org.apache.spark.mllib.stat.test.BinarySample
isFinal() - Method in enum org.apache.spark.launcher.SparkAppHandle.State: Whether this state is a final state, meaning the application is not running anymore once it's reached.
isin(Object...) - Method in class org.apache.spark.sql.Column: A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
isin(Seq<Object>) - Method in class org.apache.spark.sql.Column: A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
isInitialValueFinal() - Method in class org.apache.spark.partial.PartialResult
isInterrupted() - Method in class org.apache.spark.TaskContext: Returns true if the task has been killed.
isLargerBetter() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
isLargerBetter() - Method in class org.apache.spark.ml.evaluation.Evaluator: Indicates whether the metric returned by evaluate() should be maximized (true, default) or minimized (false).
isLargerBetter() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
isLargerBetter() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
isLeaf() - Method in class org.apache.spark.mllib.tree.model.Node
isLeftChild(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Returns true if this is a left child.
isLocal() - Method in class org.apache.spark.api.java.JavaSparkContext
isLocal() - Method in class org.apache.spark.SparkContext
isLocal() - Method in class org.apache.spark.sql.DataFrame: Returns true if the collect and take methods can be run locally (without any Spark executors).
isMulticlassClassification() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
isMulticlassWithCategoricalFeatures() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Duration
isMultipleOf(Duration) - Method in class org.apache.spark.streaming.Time
isNaN() - Method in class org.apache.spark.sql.Column: True if the current expression is NaN.
isNaN(Column) - Static method in class org.apache.spark.sql.functions: Deprecated. As of 1.6.0, replaced by isnan. This will be removed in Spark 2.0.
isnan(Column) - Static method in class org.apache.spark.sql.functions: Return true iff the column is NaN.
isNominal() - Method in class org.apache.spark.ml.attribute.Attribute: Tests whether this attribute is nominal, true for NominalAttribute and BinaryAttribute.
isNominal() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
isNominal() - Method in class org.apache.spark.ml.attribute.NominalAttribute
isNominal() - Method in class org.apache.spark.ml.attribute.NumericAttribute
isNominal() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
isNotNull() - Method in class org.apache.spark.sql.Column: True if the current expression is NOT null.
IsNotNull - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a non-null value.
IsNotNull(String) - Constructor for class org.apache.spark.sql.sources.IsNotNull
isNull() - Method in class org.apache.spark.sql.Column: True if the current expression is null.
isnull(Column) - Static method in class org.apache.spark.sql.functions: Return true iff the column is null.
IsNull - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to null.
IsNull(String) - Constructor for class org.apache.spark.sql.sources.IsNull
isNullAt(int) - Method in interface org.apache.spark.sql.Row: Checks whether the value at position i is null.
isNumeric() - Method in class org.apache.spark.ml.attribute.Attribute: Tests whether this attribute is numeric, true for NumericAttribute and BinaryAttribute.
isNumeric() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
isNumeric() - Method in class org.apache.spark.ml.attribute.NominalAttribute
isNumeric() - Method in class org.apache.spark.ml.attribute.NumericAttribute
isNumeric() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
isOrdinal() - Method in class org.apache.spark.ml.attribute.NominalAttribute
isotonic() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
IsotonicRegression - Class in org.apache.spark.ml.regression
IsotonicRegression(String) - Constructor for class org.apache.spark.ml.regression.IsotonicRegression
IsotonicRegression() - Constructor for class org.apache.spark.ml.regression.IsotonicRegression
IsotonicRegression - Class in org.apache.spark.mllib.regression
IsotonicRegression() - Constructor for class org.apache.spark.mllib.regression.IsotonicRegression
IsotonicRegressionModel - Class in org.apache.spark.ml.regression: :: Experimental :: Model fitted by IsotonicRegression.
IsotonicRegressionModel - Class in org.apache.spark.mllib.regression: Regression model for isotonic regression.
IsotonicRegressionModel(double[], double[], boolean) - Constructor for class org.apache.spark.mllib.regression.IsotonicRegressionModel
IsotonicRegressionModel(Iterable<Object>, Iterable<Object>, Boolean) - Constructor for class org.apache.spark.mllib.regression.IsotonicRegressionModel: A Java-friendly constructor that takes two Iterable parameters and one Boolean parameter.
isRDD() - Method in class org.apache.spark.storage.BlockId
isRootContext() - Method in class org.apache.spark.sql.SQLContext
isRunningLocally() - Method in class org.apache.spark.TaskContext: Returns true if the task is running locally in the driver program.
isSet(Param<?>) - Method in interface org.apache.spark.ml.param.Params
isShuffle() - Method in class org.apache.spark.storage.BlockId
isSorted(int[]) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
isSparkPortConf(String) - Static method in class org.apache.spark.SparkConf: Return true if the given config matches either spark.*.port or spark.port.*.
isStarted() - Method in class org.apache.spark.streaming.receiver.Receiver: Check if the receiver has started or not.
isStopped() - Method in class org.apache.spark.SparkContext
isStopped() - Method in class org.apache.spark.SparkEnv
isStopped() - Method in class org.apache.spark.streaming.receiver.Receiver: Check if receiver has been marked for stopping.
isTimingOut() - Method in class org.apache.spark.streaming.State: Whether the state is timing out and going to be removed by the system after the current batch.
isTraceEnabled() - Method in interface org.apache.spark.Logging
isTransposed() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
isTransposed() - Method in interface org.apache.spark.mllib.linalg.Matrix: Flag that keeps track whether the matrix is transposed or not.
isTransposed() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
isValid() - Method in class org.apache.spark.ml.param.Param
isValid() - Method in class org.apache.spark.storage.StorageLevel
isZero() - Method in class org.apache.spark.sql.types.Decimal
isZero() - Method in class org.apache.spark.streaming.Duration
it() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
item() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
itemFactors() - Method in class org.apache.spark.ml.recommendation.ALSModel
items() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
iterationTimes() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
iterator(Partition, TaskContext) - Method in interface org.apache.spark.api.java.JavaRDDLike: Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD: Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
iterator() - Method in class org.apache.spark.sql.types.StructType

J

j() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
jarOfClass(Class<?>) - Static method in class org.apache.spark.api.java.JavaSparkContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.SparkContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to SparkContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfClass(Class<?>) - Static method in class org.apache.spark.streaming.StreamingContext: Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
jarOfObject(Object) - Static method in class org.apache.spark.api.java.JavaSparkContext: Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jarOfObject(Object) - Static method in class org.apache.spark.SparkContext: Find the JAR that contains the class of a particular object, to make it easy for users to pass their JARs to SparkContext.
jars() - Method in class org.apache.spark.api.java.JavaSparkContext
jars() - Method in class org.apache.spark.SparkContext
javaAntecedent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule: Returns antecedent in a Java List.
javaCategoryMaps() - Method in class org.apache.spark.ml.feature.VectorIndexerModel: Java-friendly version of categoryMaps
javaConsequent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule: Returns consequent in a Java List.
JavaDoubleRDD - Class in org.apache.spark.api.java
JavaDoubleRDD(RDD<Object>) - Constructor for class org.apache.spark.api.java.JavaDoubleRDD
JavaDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data.
JavaDStream(DStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaDStream
JavaDStreamLike<T,This extends JavaDStreamLike<T,This,R>,R extends JavaRDDLike<T,R>> - Interface in org.apache.spark.streaming.api.java
JavaFutureAction<T> - Interface in org.apache.spark.api.java
JavaHadoopRDD<K,V> - Class in org.apache.spark.api.java
JavaHadoopRDD(HadoopRDD<K, V>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaHadoopRDD
JavaInputDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to InputDStream.
JavaInputDStream(InputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaInputDStream
javaItems() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset: Returns items in a Java List.
JavaIterableWrapperSerializer - Class in org.apache.spark.serializer: A Kryo serializer for serializing results returned by asJavaIterable.
JavaIterableWrapperSerializer() - Constructor for class org.apache.spark.serializer.JavaIterableWrapperSerializer
JavaMapWithStateDStream<KeyType,ValueType,StateType,MappedType> - Class in org.apache.spark.streaming.api.java: :: Experimental :: DStream representing the stream of data generated by mapWithState operation on a JavaPairDStream.
JavaNewHadoopRDD<K,V> - Class in org.apache.spark.api.java
JavaNewHadoopRDD(NewHadoopRDD<K, V>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaNewHadoopRDD
JavaPairDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to a DStream of key-value pairs, which provides extra methods like reduceByKey and join.
JavaPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairDStream
JavaPairInputDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to InputDStream of key-value pairs.
JavaPairInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairInputDStream
JavaPairRDD<K,V> - Class in org.apache.spark.api.java
JavaPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.api.java.JavaPairRDD
JavaPairReceiverInputDStream<K,V> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaPairReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
JavaParams - Class in org.apache.spark.ml.param: :: DeveloperApi :: Java-friendly wrapper for Params.
JavaParams() - Constructor for class org.apache.spark.ml.param.JavaParams
JavaRDD<T> - Class in org.apache.spark.api.java
JavaRDD(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.api.java.JavaRDD
javaRDD() - Method in class org.apache.spark.sql.DataFrame: Returns the content of the DataFrame as a JavaRDD of Rows.
JavaRDDLike<T,This extends JavaRDDLike<T,This>> - Interface in org.apache.spark.api.java: Defines operations common to several Java RDD implementations.
JavaReceiverInputDStream<T> - Class in org.apache.spark.streaming.api.java: A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.
JavaReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
javaSequence() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence: Returns sequence as a Java List of lists for Java users.
javaSerialization(ClassTag<T>) - Static method in class org.apache.spark.sql.Encoders: (Scala-specific) Creates an encoder that serializes objects of type T using generic Java serialization.
javaSerialization(Class<T>) - Static method in class org.apache.spark.sql.Encoders: Creates an encoder that serializes objects of type T using generic Java serialization.
JavaSerializer - Class in org.apache.spark.serializer: :: DeveloperApi :: A Spark serializer that uses Java's built-in serialization.
JavaSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.JavaSerializer
JavaSerializer() - Constructor for class org.apache.spark.serializer.JavaSerializer
JavaSparkContext - Class in org.apache.spark.api.java: A Java-friendly version of SparkContext that returns JavaRDDs and works with Java collections instead of Scala ones.
JavaSparkContext(SparkContext) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext() - Constructor for class org.apache.spark.api.java.JavaSparkContext: Create a JavaSparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
JavaSparkContext(SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String[]) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkContext(String, String, String, String[], Map<String, String>) - Constructor for class org.apache.spark.api.java.JavaSparkContext
JavaSparkListener - Class in org.apache.spark: Java clients should extend this class instead of implementing SparkListener directly.
JavaSparkListener() - Constructor for class org.apache.spark.JavaSparkListener
JavaSparkStatusTracker - Class in org.apache.spark.api.java: Low-level status reporting APIs for monitoring job and stage progress.
JavaStreamingContext - Class in org.apache.spark.streaming.api.java: A Java-friendly version of StreamingContext which is the main entry point for Spark Streaming functionality.
JavaStreamingContext(StreamingContext) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext
JavaStreamingContext(String, String, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[]) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(String, String, Duration, String, String[], Map<String, String>) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a StreamingContext.
JavaStreamingContext(JavaSparkContext, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a JavaStreamingContext using an existing JavaSparkContext.
JavaStreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a JavaStreamingContext using a SparkConf configuration.
JavaStreamingContext(String) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Recreate a JavaStreamingContext from a checkpoint file.
JavaStreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.api.java.JavaStreamingContext: Re-creates a JavaStreamingContext from a checkpoint file.
JavaStreamingContextFactory - Interface in org.apache.spark.streaming.api.java: Factory interface for creating a new JavaStreamingContext
javaTopicAssignments() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Java-friendly version of topicAssignments
javaTopicDistributions() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Java-friendly version of topicDistributions
javaTopTopicsPerDocument(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Java-friendly version of topTopicsPerDocument
javaToPython() - Method in class org.apache.spark.sql.DataFrame: Converts a JavaRDD to a PythonRDD.
jdbc(String, String, Properties) - Method in class org.apache.spark.sql.DataFrameReader: Construct a DataFrame representing the database table accessible via JDBC URL url named table and connection properties.
jdbc(String, String, String, long, long, int, Properties) - Method in class org.apache.spark.sql.DataFrameReader: Construct a DataFrame representing the database table accessible via JDBC URL url named table.
jdbc(String, String, String[], Properties) - Method in class org.apache.spark.sql.DataFrameReader: Construct a DataFrame representing the database table accessible via JDBC URL url named table using connection properties.
jdbc(String, String, Properties) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame to an external database table via JDBC.
jdbc(String, String) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().jdbc(). This will be removed in Spark 2.0.
jdbc(String, String, String, long, long, int) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().jdbc(). This will be removed in Spark 2.0.
jdbc(String, String, String[]) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().jdbc(). This will be removed in Spark 2.0.
JdbcDialect - Class in org.apache.spark.sql.jdbc: :: DeveloperApi :: Encapsulates everything (extensions, workarounds, quirks) to handle the SQL dialect of a certain database or jdbc driver.
JdbcDialect() - Constructor for class org.apache.spark.sql.jdbc.JdbcDialect
JdbcDialects - Class in org.apache.spark.sql.jdbc: :: DeveloperApi :: Registry of dialects that apply to every new jdbc DataFrame.
JdbcDialects() - Constructor for class org.apache.spark.sql.jdbc.JdbcDialects
jdbcNullType() - Method in class org.apache.spark.sql.jdbc.JdbcType
JdbcRDD<T> - Class in org.apache.spark.rdd: An RDD that executes an SQL query on a JDBC connection and reads results.
JdbcRDD(SparkContext, Function0<Connection>, String, long, long, int, Function1<ResultSet, T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.JdbcRDD
JdbcRDD.ConnectionFactory - Interface in org.apache.spark.rdd
JdbcType - Class in org.apache.spark.sql.jdbc: :: DeveloperApi :: A database type definition coupled with the jdbc type needed to send null values to the database.
JdbcType(String, int) - Constructor for class org.apache.spark.sql.jdbc.JdbcType
jobConfCacheKey() - Method in class org.apache.spark.rdd.HadoopRDD
JobData - Class in org.apache.spark.status.api.v1
JobExecutionStatus - Enum in org.apache.spark
jobGroup() - Method in class org.apache.spark.status.api.v1.JobData
jobGroupToJobIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
jobId() - Method in class org.apache.spark.rdd.NewHadoopRDD
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
jobId() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
jobId() - Method in interface org.apache.spark.SparkJobInfo
jobId() - Method in class org.apache.spark.SparkJobInfoImpl
jobId() - Method in class org.apache.spark.status.api.v1.JobData
jobID() - Method in class org.apache.spark.TaskCommitDenied
jobIds() - Method in interface org.apache.spark.api.java.JavaFutureAction: Returns the job IDs run by the underlying async operation.
jobIds() - Method in class org.apache.spark.ComplexFutureAction
jobIds() - Method in interface org.apache.spark.FutureAction: Returns the job IDs run by the underlying async operation.
jobIds() - Method in class org.apache.spark.SimpleFutureAction
jobIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
JobLogger - Class in org.apache.spark.scheduler: :: DeveloperApi :: A logger class to record runtime information for jobs in Spark.
JobLogger(String, String) - Constructor for class org.apache.spark.scheduler.JobLogger
JobLogger() - Constructor for class org.apache.spark.scheduler.JobLogger
jobLogInfo(int, String, boolean) - Method in class org.apache.spark.scheduler.JobLogger: Write info into log file
JobProgressListener - Class in org.apache.spark.ui.jobs: :: DeveloperApi :: Tracks task-level information to be displayed in the UI.
JobProgressListener(SparkConf) - Constructor for class org.apache.spark.ui.jobs.JobProgressListener
JobResult - Interface in org.apache.spark.scheduler: :: DeveloperApi :: A result of a job in the DAGScheduler.
jobResult() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
JobSucceeded - Class in org.apache.spark.scheduler
JobSucceeded() - Constructor for class org.apache.spark.scheduler.JobSucceeded
join(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD containing all pairs of elements with matching keys in this and other.
join(DataFrame) - Method in class org.apache.spark.sql.DataFrame: Cartesian join with another DataFrame.
join(DataFrame, String) - Method in class org.apache.spark.sql.DataFrame: Inner equi-join with another DataFrame using the given column.
join(DataFrame, Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Inner equi-join with another DataFrame using the given columns.
join(DataFrame, Seq<String>, String) - Method in class org.apache.spark.sql.DataFrame: Equi-join with another DataFrame using the given columns.
join(DataFrame, Column) - Method in class org.apache.spark.sql.DataFrame: Inner join with another DataFrame, using the given join expression.
join(DataFrame, Column, String) - Method in class org.apache.spark.sql.DataFrame: Join with another DataFrame, using the given join expression.
join(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
join(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'join' between RDDs of this DStream and other DStream.
joinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, U, VD>, ClassTag) - Method in class org.apache.spark.graphx.GraphOps: Join the vertices with an RDD and then apply a function from the vertex and RDD entry to a new vertex value.
joinWith(Dataset, Column, String) - Method in class org.apache.spark.sql.Dataset: Joins this Dataset returning a Tuple2 for each pair where condition evaluates to true.
joinWith(Dataset, Column) - Method in class org.apache.spark.sql.Dataset: Joins this Dataset using an inner equi-join, returning a Tuple2 for each pair where condition evaluates to true.
json(String) - Method in class org.apache.spark.sql.DataFrameReader: Loads a JSON file (one object per line) and returns the result as a DataFrame.
json(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads a JSON file (one object per line) and returns the result as a DataFrame.
json(JavaRDD<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads a JavaRDD[String] storing JSON objects (one object per record) and returns the result as a DataFrame.
json(RDD<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads an RDD[String] storing JSON objects (one object per record) and returns the result as a DataFrame.
json(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame in JSON format at the specified path.
json() - Method in class org.apache.spark.sql.types.DataType: The compact JSON representation of this data type.
json() - Method in class org.apache.spark.sql.types.Metadata: Converts to its JSON representation.
json_tuple(Column, String...) - Static method in class org.apache.spark.sql.functions: Creates a new row for a json column according to the given field names.
json_tuple(Column, Seq<String>) - Static method in class org.apache.spark.sql.functions: Creates a new row for a json column according to the given field names.
jsonDecode(String) - Method in class org.apache.spark.ml.param.BooleanParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.DoubleArrayParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.DoubleParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.FloatParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.IntArrayParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.IntParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.LongParam
jsonDecode(String) - Method in class org.apache.spark.ml.param.Param: Decodes a param value from JSON.
jsonDecode(String) - Method in class org.apache.spark.ml.param.StringArrayParam
jsonEncode(boolean) - Method in class org.apache.spark.ml.param.BooleanParam
jsonEncode(double[]) - Method in class org.apache.spark.ml.param.DoubleArrayParam
jsonEncode(double) - Method in class org.apache.spark.ml.param.DoubleParam
jsonEncode(float) - Method in class org.apache.spark.ml.param.FloatParam
jsonEncode(int[]) - Method in class org.apache.spark.ml.param.IntArrayParam
jsonEncode(int) - Method in class org.apache.spark.ml.param.IntParam
jsonEncode(long) - Method in class org.apache.spark.ml.param.LongParam
jsonEncode(T) - Method in class org.apache.spark.ml.param.Param: Encodes a param value into JSON, which can be decoded by jsonDecode().
jsonEncode(String[]) - Method in class org.apache.spark.ml.param.StringArrayParam
jsonFile(String) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().json(). This will be removed in Spark 2.0.
jsonFile(String, StructType) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().json(). This will be removed in Spark 2.0.
jsonFile(String, double) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().json(). This will be removed in Spark 2.0.
jsonRDD(RDD<String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().json(). This will be removed in Spark 2.0.
jsonRDD(JavaRDD<String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().json(). This will be removed in Spark 2.0.
jsonRDD(RDD<String>, StructType) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().json(). This will be removed in Spark 2.0.
jsonRDD(JavaRDD<String>, StructType) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().json(). This will be removed in Spark 2.0.
jsonRDD(RDD<String>, double) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().json(). This will be removed in Spark 2.0.
jsonRDD(JavaRDD<String>, double) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().json(). This will be removed in Spark 2.0.
jValueDecode(JsonAST.JValue) - Static method in class org.apache.spark.ml.param.DoubleParam: Decodes a param value from JValue.
jValueDecode(JsonAST.JValue) - Static method in class org.apache.spark.ml.param.FloatParam: Decodes a param value from JValue.
jValueEncode(double) - Static method in class org.apache.spark.ml.param.DoubleParam: Encodes a param value into JValue.
jValueEncode(float) - Static method in class org.apache.spark.ml.param.FloatParam: Encodes a param value into JValue.
jvmGcTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
jvmGcTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
jvmInformation() - Method in class org.apache.spark.ui.env.EnvironmentListener

K

k() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Number of leaf clusters.
k() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
k() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
k() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
k() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Number of gaussians in mixture
k() - Method in class org.apache.spark.mllib.clustering.KMeansModel: Total number of clusters.
k() - Method in class org.apache.spark.mllib.clustering.LDAModel: Number of topics
k() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
k() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
k() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
k() - Method in class org.apache.spark.mllib.feature.PCA
k() - Method in class org.apache.spark.mllib.feature.PCAModel
K_MEANS_PARALLEL() - Static method in class org.apache.spark.mllib.clustering.KMeans
KafkaUtils - Class in org.apache.spark.streaming.kafka
KafkaUtils() - Constructor for class org.apache.spark.streaming.kafka.KafkaUtils
kClassTag() - Method in class org.apache.spark.api.java.JavaHadoopRDD
kClassTag() - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
kClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
kClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
KernelDensity - Class in org.apache.spark.mllib.stat: Kernel density estimation.
KernelDensity() - Constructor for class org.apache.spark.mllib.stat.KernelDensity
keyAs(Encoder<L>) - Method in class org.apache.spark.sql.GroupedDataset: Returns a new GroupedDataset where the type of the key has been mapped to the specified type.
keyBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Creates tuples of the elements in this RDD by applying f.
keyBy(Function1<T, K>) - Method in class org.apache.spark.rdd.RDD: Creates tuples of the elements in this RDD by applying f.
keyClassName() - Method in class org.apache.spark.ShuffleDependency
keyOrdering() - Method in class org.apache.spark.ShuffleDependency
keys() - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the keys of each tuple.
keys() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the keys of each tuple.
keys() - Method in class org.apache.spark.sql.GroupedDataset: Returns a Dataset that contains each unique key.
keyType() - Method in class org.apache.spark.sql.types.MapType
kFold(RDD<T>, int, int, ClassTag<T>) - Static method in class org.apache.spark.mllib.util.MLUtils: Return a k-element array of pairs of RDDs, where the first element of each pair is the training data (the complement of the validation data) and the second element is the validation data, containing a unique 1/kth of the data.
kill() - Method in interface org.apache.spark.launcher.SparkAppHandle: Tries to kill the underlying application.
killExecutor(String) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Request that the cluster manager kill the specified executor.
killExecutors(Seq<String>) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Request that the cluster manager kill the specified executors.
KillTask - Class in org.apache.spark.scheduler.local
KillTask(long, boolean) - Constructor for class org.apache.spark.scheduler.local.KillTask
KinesisUtils - Class in org.apache.spark.streaming.kinesis
KinesisUtils() - Constructor for class org.apache.spark.streaming.kinesis.KinesisUtils
KinesisUtilsPythonHelper - Class in org.apache.spark.streaming.kinesis: This is a helper class that wraps the methods in KinesisUtils into a more Python-friendly class and function so that they can be easily instantiated and called from Python's KinesisUtils.
KinesisUtilsPythonHelper() - Constructor for class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
kManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
KMeans - Class in org.apache.spark.ml.clustering
KMeans(String) - Constructor for class org.apache.spark.ml.clustering.KMeans
KMeans() - Constructor for class org.apache.spark.ml.clustering.KMeans
KMeans - Class in org.apache.spark.mllib.clustering: K-means clustering with support for multiple parallel runs and a k-means++ like initialization mode (the k-means|| algorithm by Bahmani et al).
KMeans() - Constructor for class org.apache.spark.mllib.clustering.KMeans: Constructs a KMeans instance with default parameters: {k: 2, maxIterations: 20, runs: 1, initializationMode: "k-means||", initializationSteps: 5, epsilon: 1e-4, seed: random}.
KMeansDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate test data for KMeans.
KMeansDataGenerator() - Constructor for class org.apache.spark.mllib.util.KMeansDataGenerator
KMeansModel - Class in org.apache.spark.ml.clustering: :: Experimental :: Model fitted by KMeans.
KMeansModel - Class in org.apache.spark.mllib.clustering: A clustering model for K-means.
KMeansModel(Vector[]) - Constructor for class org.apache.spark.mllib.clustering.KMeansModel
KMeansModel(Iterable<Vector>) - Constructor for class org.apache.spark.mllib.clustering.KMeansModel: A Java-friendly constructor that takes an Iterable of Vectors.
kolmogorovSmirnovTest(RDD<Object>, String, double...) - Static method in class org.apache.spark.mllib.stat.Statistics: Convenience function to conduct a one-sample, two-sided Kolmogorov-Smirnov test for probability distribution equality.
kolmogorovSmirnovTest(JavaDoubleRDD, String, double...) - Static method in class org.apache.spark.mllib.stat.Statistics: Java-friendly version of kolmogorovSmirnovTest()
kolmogorovSmirnovTest(RDD<Object>, Function1<Object, Object>) - Static method in class org.apache.spark.mllib.stat.Statistics: Conduct the two-sided Kolmogorov-Smirnov (KS) test for data sampled from a continuous distribution.
kolmogorovSmirnovTest(RDD<Object>, String, Seq<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics: Convenience function to conduct a one-sample, two-sided Kolmogorov-Smirnov test for probability distribution equality.
kolmogorovSmirnovTest(JavaDoubleRDD, String, Seq<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics: Java-friendly version of kolmogorovSmirnovTest()
KolmogorovSmirnovTestResult - Class in org.apache.spark.mllib.stat.test: :: Experimental :: Object containing the test results for the Kolmogorov-Smirnov test.
kryo(ClassTag<T>) - Static method in class org.apache.spark.sql.Encoders: (Scala-specific) Creates an encoder that serializes objects of type T using Kryo.
kryo(Class<T>) - Static method in class org.apache.spark.sql.Encoders: Creates an encoder that serializes objects of type T using Kryo.
KryoRegistrator - Interface in org.apache.spark.serializer: Interface implemented by clients to register their classes with Kryo when using Kryo serialization.
KryoSerializer - Class in org.apache.spark.serializer: A Spark serializer that uses the Kryo serialization library.
KryoSerializer(SparkConf) - Constructor for class org.apache.spark.serializer.KryoSerializer
kurtosis(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the kurtosis of the values in a group.
kurtosis(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the kurtosis of the values in a group.

L

L1Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Updater for L1 regularized problems.
L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
labelCol() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
labelCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Field in "predictions" which gives the true label of each instance.
labelCol() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
LabelConverter - Class in org.apache.spark.ml.classification: Label to vector converter.
LabelConverter() - Constructor for class org.apache.spark.ml.classification.LabelConverter
LabeledPoint - Class in org.apache.spark.mllib.regression: Class that represents the features and labels of a data point.
LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
LabelPropagation - Class in org.apache.spark.graphx.lib: Label Propagation algorithm.
LabelPropagation() - Constructor for class org.apache.spark.graphx.lib.LabelPropagation
labels() - Method in class org.apache.spark.ml.feature.IndexToString
labels() - Method in class org.apache.spark.ml.feature.StringIndexerModel
labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
labels() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns the sequence of labels in ascending order
labels() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns the sequence of labels in ascending order
lag(Column, int) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row.
lag(String, int) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row.
lag(String, int, Object) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.
lag(Column, int, Object) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows before the current row, and defaultValue if there are fewer than offset rows before the current row.
LassoModel - Class in org.apache.spark.mllib.regression: Regression model trained using Lasso.
LassoModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LassoModel
LassoWithSGD - Class in org.apache.spark.mllib.regression: Train a regression model with L1-regularization using Stochastic Gradient Descent.
LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD: Construct a Lasso object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
last(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the last value in a group.
last(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the last value of the column in a group.
last_day(Column) - Static method in class org.apache.spark.sql.functions: Given a date column, returns the last day of the month which the given date belongs to.
lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
lastErrorTime() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
lastValidTime() - Method in class org.apache.spark.streaming.dstream.InputDStream
latestModel() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Return the latest model.
latestModel() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Return the latest model.
launch() - Method in class org.apache.spark.launcher.SparkLauncher: Launches a sub-process that will start the configured Spark application.
launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
launchTime() - Method in class org.apache.spark.status.api.v1.TaskData
layers() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
LBFGS - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to solve an optimization problem using Limited-memory BFGS.
LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
LDA - Class in org.apache.spark.ml.clustering: :: Experimental ::
LDA(String) - Constructor for class org.apache.spark.ml.clustering.LDA
LDA() - Constructor for class org.apache.spark.ml.clustering.LDA
LDA - Class in org.apache.spark.mllib.clustering: Latent Dirichlet Allocation (LDA), a topic model designed for text documents.
LDA() - Constructor for class org.apache.spark.mllib.clustering.LDA: Constructs an LDA instance with default parameters.
LDAModel - Class in org.apache.spark.ml.clustering: :: Experimental :: Model fitted by LDA.
LDAModel - Class in org.apache.spark.mllib.clustering: Latent Dirichlet Allocation (LDA) model.
LDAOptimizer - Interface in org.apache.spark.mllib.clustering: :: DeveloperApi ::
lead(String, int) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row.
lead(Column, int) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row.
lead(String, int, Object) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.
lead(Column, int, Object) - Static method in class org.apache.spark.sql.functions: Window function: returns the value that is offset rows after the current row, and defaultValue if there are fewer than offset rows after the current row.
LeafNode - Class in org.apache.spark.ml.tree: :: DeveloperApi :: Decision tree leaf node.
learningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
least(Column...) - Static method in class org.apache.spark.sql.functions: Returns the least value of the list of values, skipping null values.
least(String, String...) - Static method in class org.apache.spark.sql.functions: Returns the least value of the list of column names, skipping null values.
least(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Returns the least value of the list of values, skipping null values.
least(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Returns the least value of the list of column names, skipping null values.
LeastSquaresAggregator - Class in org.apache.spark.ml.regression: LeastSquaresAggregator computes the gradient and loss for a Least-squared loss function, as used in linear regression for samples in sparse or dense vectors in an online fashion.
LeastSquaresAggregator(Vector, double, double, boolean, double[], double[]) - Constructor for class org.apache.spark.ml.regression.LeastSquaresAggregator
LeastSquaresCostFun - Class in org.apache.spark.ml.regression: LeastSquaresCostFun implements Breeze's DiffFunction[T] for Least Squares cost.
LeastSquaresCostFun(RDD<org.apache.spark.ml.feature.Instance>, double, double, boolean, boolean, double[], double[], double) - Constructor for class org.apache.spark.ml.regression.LeastSquaresCostFun
LeastSquaresGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a Least-squared loss function, as used in linear regression.
LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
left() - Method in class org.apache.spark.sql.sources.And
left() - Method in class org.apache.spark.sql.sources.Or
leftCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit: Get sorted categories which split to the left
leftChild() - Method in class org.apache.spark.ml.tree.InternalNode
leftChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return the index of the left child of this node.
leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD: Left joins this VertexRDD with an RDD containing vertex attribute pairs.
leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a left outer join of this and other.
leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'left outer join' between RDDs of this DStream and other DStream.
leftPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD: Left joins this RDD with another VertexRDD with the same index.
LEGACY_DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext: Legacy version of DRIVER_IDENTIFIER, retained for backwards-compatibility.
length() - Method in class org.apache.spark.scheduler.SplitInfo
length(Column) - Static method in class org.apache.spark.sql.functions: Computes the length of a given string or binary column.
length() - Method in interface org.apache.spark.sql.Row: Number of elements in the Row.
length() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
length() - Method in class org.apache.spark.sql.types.StructType
length() - Method in class org.apache.spark.util.Vector
leq(Object) - Method in class org.apache.spark.sql.Column: Less than or equal to.
less(Duration) - Method in class org.apache.spark.streaming.Duration
less(Time) - Method in class org.apache.spark.streaming.Time
lessEq(Duration) - Method in class org.apache.spark.streaming.Duration
lessEq(Time) - Method in class org.apache.spark.streaming.Time
LessThan - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a value less than value.
LessThan(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThan
LessThanOrEqual - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a value less than or equal to value.
LessThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThanOrEqual
levenshtein(Column, Column) - Static method in class org.apache.spark.sql.functions: Computes the Levenshtein distance of the two given string columns.
like(String) - Method in class org.apache.spark.sql.Column: SQL like expression.
limit(int) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame by taking the first n rows.
line() - Method in exception org.apache.spark.sql.AnalysisException
LinearDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate sample data used for Linear Data.
LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
LinearRegression - Class in org.apache.spark.ml.regression: :: Experimental :: Linear regression.
LinearRegression(String) - Constructor for class org.apache.spark.ml.regression.LinearRegression
LinearRegression() - Constructor for class org.apache.spark.ml.regression.LinearRegression
LinearRegressionModel - Class in org.apache.spark.ml.regression: :: Experimental :: Model produced by LinearRegression.
LinearRegressionModel - Class in org.apache.spark.mllib.regression: Regression model trained using LinearRegression.
LinearRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionModel
LinearRegressionSummary - Class in org.apache.spark.ml.regression
LinearRegressionTrainingSummary - Class in org.apache.spark.ml.regression
LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train a linear regression model with no regularization using Stochastic Gradient Descent.
LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Construct a LinearRegression object with default parameters: {stepSize: 1.0, numIterations: 100, miniBatchFraction: 1.0}.
listener() - Method in class org.apache.spark.sql.SQLContext
listenerBus() - Method in class org.apache.spark.SparkContext
listenerManager() - Method in class org.apache.spark.sql.SQLContext
listLeafFiles(FileSystem, FileStatus) - Static method in class org.apache.spark.sql.sources.HadoopFsRelation
listLeafFilesInParallel(String[], Configuration, SparkContext) - Static method in class org.apache.spark.sql.sources.HadoopFsRelation
lit(Object) - Static method in class org.apache.spark.sql.functions: Creates a Column of literal value.
load(String) - Static method in class org.apache.spark.ml.classification.LogisticRegression
load(String) - Static method in class org.apache.spark.ml.classification.LogisticRegressionModel
load(String) - Static method in class org.apache.spark.ml.classification.NaiveBayes
load(String) - Static method in class org.apache.spark.ml.classification.NaiveBayesModel
load(String) - Static method in class org.apache.spark.ml.clustering.DistributedLDAModel
load(String) - Static method in class org.apache.spark.ml.clustering.KMeans
load(String) - Static method in class org.apache.spark.ml.clustering.KMeansModel
load(String) - Static method in class org.apache.spark.ml.clustering.LDA
load(String) - Static method in class org.apache.spark.ml.clustering.LocalLDAModel
load(String) - Static method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
load(String) - Static method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
load(String) - Static method in class org.apache.spark.ml.evaluation.RegressionEvaluator
load(String) - Static method in class org.apache.spark.ml.feature.Binarizer
load(String) - Static method in class org.apache.spark.ml.feature.Bucketizer
load(String) - Static method in class org.apache.spark.ml.feature.ChiSqSelector
load(String) - Static method in class org.apache.spark.ml.feature.ChiSqSelectorModel
load(String) - Static method in class org.apache.spark.ml.feature.CountVectorizer
load(String) - Static method in class org.apache.spark.ml.feature.CountVectorizerModel
load(String) - Static method in class org.apache.spark.ml.feature.DCT
load(String) - Static method in class org.apache.spark.ml.feature.HashingTF
load(String) - Static method in class org.apache.spark.ml.feature.IDF
load(String) - Static method in class org.apache.spark.ml.feature.IDFModel
load(String) - Static method in class org.apache.spark.ml.feature.IndexToString
load(String) - Static method in class org.apache.spark.ml.feature.Interaction
load(String) - Static method in class org.apache.spark.ml.feature.MinMaxScaler
load(String) - Static method in class org.apache.spark.ml.feature.MinMaxScalerModel
load(String) - Static method in class org.apache.spark.ml.feature.NGram
load(String) - Static method in class org.apache.spark.ml.feature.Normalizer
load(String) - Static method in class org.apache.spark.ml.feature.OneHotEncoder
load(String) - Static method in class org.apache.spark.ml.feature.PCA
load(String) - Static method in class org.apache.spark.ml.feature.PCAModel
load(String) - Static method in class org.apache.spark.ml.feature.PolynomialExpansion
load(String) - Static method in class org.apache.spark.ml.feature.QuantileDiscretizer
load(String) - Static method in class org.apache.spark.ml.feature.RegexTokenizer
load(String) - Static method in class org.apache.spark.ml.feature.SQLTransformer
load(String) - Static method in class org.apache.spark.ml.feature.StandardScaler
load(String) - Static method in class org.apache.spark.ml.feature.StandardScalerModel
load(String) - Static method in class org.apache.spark.ml.feature.StopWordsRemover
load(String) - Static method in class org.apache.spark.ml.feature.StringIndexer
load(String) - Static method in class org.apache.spark.ml.feature.StringIndexerModel
load(String) - Static method in class org.apache.spark.ml.feature.Tokenizer
load(String) - Static method in class org.apache.spark.ml.feature.VectorAssembler
load(String) - Static method in class org.apache.spark.ml.feature.VectorIndexer
load(String) - Static method in class org.apache.spark.ml.feature.VectorIndexerModel
load(String) - Static method in class org.apache.spark.ml.feature.VectorSlicer
load(String) - Static method in class org.apache.spark.ml.feature.Word2Vec
load(String) - Static method in class org.apache.spark.ml.feature.Word2VecModel
load(String) - Static method in class org.apache.spark.ml.Pipeline
load(String) - Static method in class org.apache.spark.ml.PipelineModel
load(String) - Static method in class org.apache.spark.ml.recommendation.ALS
load(String) - Static method in class org.apache.spark.ml.recommendation.ALSModel
load(String) - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegression
load(String) - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
load(String) - Static method in class org.apache.spark.ml.regression.IsotonicRegression
load(String) - Static method in class org.apache.spark.ml.regression.IsotonicRegressionModel
load(String) - Static method in class org.apache.spark.ml.regression.LinearRegression
load(String) - Static method in class org.apache.spark.ml.regression.LinearRegressionModel
load(String) - Static method in class org.apache.spark.ml.tuning.CrossValidator
load(String) - Static method in class org.apache.spark.ml.tuning.CrossValidatorModel
load(String) - Method in interface org.apache.spark.ml.util.MLReadable: Reads an ML instance from the input path, a shortcut of read.load(path).
load(String) - Method in class org.apache.spark.ml.util.MLReader: Loads the ML component from the input path.
load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.SVMModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.DistributedLDAModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.KMeansModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.LocalLDAModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.feature.Word2VecModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Load a model from the given path.
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LassoModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LinearRegressionModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.RandomForestModel
load(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Loader: Load a model from the given path.
load(String...) - Method in class org.apache.spark.sql.DataFrameReader: Loads input in as a DataFrame, for data sources that support multiple paths.
load(String) - Method in class org.apache.spark.sql.DataFrameReader: Loads input in as a DataFrame, for data sources that require a path.
load() - Method in class org.apache.spark.sql.DataFrameReader: Loads input in as a DataFrame, for data sources that don't require a path.
load(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads input in as a DataFrame, for data sources that support multiple paths.
load(String) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().load(path). This will be removed in Spark 2.0.
load(String, String) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().format(source).load(path). This will be removed in Spark 2.0.
load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().format(source).options(options).load(). This will be removed in Spark 2.0.
load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().format(source).options(options).load().
load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext: Deprecated.
As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
Loader<M extends Saveable> - Interface in org.apache.spark.mllib.util: :: DeveloperApi ::
loadLabeledData(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Deprecated.
Should use RDD.saveAsTextFile(java.lang.String) for saving and MLUtils.loadLabeledPoints(org.apache.spark.SparkContext, java.lang.String, int) for loading.
loadLabeledPoints(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile.
loadLabeledPoints(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile with the default number of partitions.
loadLibSVMFile(SparkContext, String, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled data in the LIBSVM format into an RDD[LabeledPoint].
loadLibSVMFile(SparkContext, String, boolean, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
loadLibSVMFile(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of partitions.
loadLibSVMFile(SparkContext, String, boolean, int) - Static method in class org.apache.spark.mllib.util.MLUtils
loadLibSVMFile(SparkContext, String, boolean) - Static method in class org.apache.spark.mllib.util.MLUtils
loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of features determined automatically and the default number of partitions.
loadVectors(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads vectors saved using RDD[Vector].saveAsTextFile.
loadVectors(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Loads vectors saved using RDD[Vector].saveAsTextFile with the default number of partitions.
LOCAL_CLUSTER_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
LOCAL_N_FAILURES_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
LOCAL_N_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
localBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
localBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
localCheckpoint() - Method in class org.apache.spark.rdd.RDD: Mark this RDD for local checkpointing using Spark's existing caching layer.
LocalLDAModel - Class in org.apache.spark.ml.clustering: :: Experimental ::
LocalLDAModel - Class in org.apache.spark.mllib.clustering: Local LDA model.
localProperties() - Method in class org.apache.spark.SparkContext
localSeqToDataFrameHolder(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLImplicits: Creates a DataFrame from a local Seq of Product.
localSeqToDatasetHolder(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLImplicits: Creates a Dataset from a local Seq.
localValue() - Method in class org.apache.spark.Accumulable: Get the current value of this accumulator from within a task.
locate(String, Column) - Static method in class org.apache.spark.sql.functions: Locate the position of the first occurrence of substr.
locate(String, Column, int) - Static method in class org.apache.spark.sql.functions: Locate the position of the first occurrence of substr in a string column, after position pos.
location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
log() - Method in interface org.apache.spark.Logging
log(Column) - Static method in class org.apache.spark.sql.functions: Computes the natural logarithm of the given value.
log(String) - Static method in class org.apache.spark.sql.functions: Computes the natural logarithm of the given column.
log(double, Column) - Static method in class org.apache.spark.sql.functions: Returns the first argument-base logarithm of the second argument.
log(double, String) - Static method in class org.apache.spark.sql.functions: Returns the first argument-base logarithm of the second argument.
log10(Column) - Static method in class org.apache.spark.sql.functions: Computes the logarithm of the given value in base 10.
log10(String) - Static method in class org.apache.spark.sql.functions: Computes the logarithm of the given value in base 10.
log1p(Column) - Static method in class org.apache.spark.sql.functions: Computes the natural logarithm of the given value plus one.
log1p(String) - Static method in class org.apache.spark.sql.functions: Computes the natural logarithm of the given column plus one.
log2(Column) - Static method in class org.apache.spark.sql.functions: Computes the logarithm of the given column in base 2.
log2(String) - Static method in class org.apache.spark.sql.functions: Computes the logarithm of the given value in base 2.
log_() - Method in interface org.apache.spark.Logging
logDebug(Function0<String>) - Method in interface org.apache.spark.Logging
logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
logDeprecationWarning(String) - Static method in class org.apache.spark.SparkConf: Logs a warning message if the given config key is deprecated.
logDirName() - Method in class org.apache.spark.scheduler.JobLogger
logError(Function0<String>) - Method in interface org.apache.spark.Logging
logError(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
Logging - Interface in org.apache.spark: Utility trait for classes that want to log data.
logicalPlan() - Method in class org.apache.spark.sql.DataFrame
logInfo(Function0<String>) - Method in interface org.apache.spark.Logging
logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
LogisticAggregator - Class in org.apache.spark.ml.classification: LogisticAggregator computes the gradient and loss for the binary logistic loss function, as used in binary classification, for instances in sparse or dense vectors in an online fashion.
LogisticAggregator(Vector, int, boolean, double[], double[]) - Constructor for class org.apache.spark.ml.classification.LogisticAggregator
LogisticCostFun - Class in org.apache.spark.ml.classification: LogisticCostFun implements Breeze's DiffFunction[T] for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression).
LogisticCostFun(RDD<org.apache.spark.ml.feature.Instance>, int, boolean, boolean, double[], double[], double) - Constructor for class org.apache.spark.ml.classification.LogisticCostFun
LogisticGradient - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Compute gradient and loss for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression).
LogisticGradient(int) - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
LogisticRegression - Class in org.apache.spark.ml.classification: :: Experimental :: Logistic regression.
LogisticRegression(String) - Constructor for class org.apache.spark.ml.classification.LogisticRegression
LogisticRegression() - Constructor for class org.apache.spark.ml.classification.LogisticRegression
LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate test data for LogisticRegression.
LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
LogisticRegressionModel - Class in org.apache.spark.ml.classification: :: Experimental :: Model produced by LogisticRegression.
LogisticRegressionModel - Class in org.apache.spark.mllib.classification: Classification model trained using Multinomial/Binary Logistic Regression.
LogisticRegressionModel(Vector, double, int, int) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
LogisticRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel: Constructs a LogisticRegressionModel with weights and intercept for binary classification.
LogisticRegressionSummary - Interface in org.apache.spark.ml.classification: Abstraction for Logistic Regression Results for a given model.
LogisticRegressionTrainingSummary - Interface in org.apache.spark.ml.classification: Abstraction for multinomial Logistic Regression Training results.
LogisticRegressionWithLBFGS - Class in org.apache.spark.mllib.classification: Train a classification model for Multinomial/Binary Logistic Regression using Limited-memory BFGS.
LogisticRegressionWithLBFGS() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification: Train a classification model for Binary Logistic Regression using Stochastic Gradient Descent.
LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Construct a LogisticRegression object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
logLikelihood(DataFrame) - Method in class org.apache.spark.ml.clustering.LDAModel: Calculates a lower bound on the log likelihood of the entire corpus.
logLikelihood() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Log likelihood of the observed tokens in the training set, given the current parameter estimates: log P(docs | topics, topic distributions for docs, alpha, eta)
logLikelihood() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
logLikelihood(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Calculates a lower bound on the log likelihood of the entire corpus.
logLikelihood(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Java-friendly version of logLikelihood
LogLoss - Class in org.apache.spark.mllib.tree.loss: :: DeveloperApi :: Class for log loss calculation (for classification).
LogLoss() - Constructor for class org.apache.spark.mllib.tree.loss.LogLoss
logName() - Method in interface org.apache.spark.Logging
LogNormalGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the log normal distribution.
LogNormalGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.LogNormalGenerator
logNormalGraph(SparkContext, int, int, double, double, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators: Generate a graph whose vertex out-degree distribution is log normal.
logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.logNormalRDD(org.apache.spark.SparkContext, double, double, long, int, long).
logNormalJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.logNormalJavaRDD(org.apache.spark.api.java.JavaSparkContext, double, double, long, int, long) with the default seed.
logNormalJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.logNormalJavaRDD(org.apache.spark.api.java.JavaSparkContext, double, double, long, int, long) with the default number of partitions and the default seed.
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.logNormalVectorRDD(org.apache.spark.SparkContext, double, double, long, int, int, long).
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.logNormalJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, double, double, long, int, int, long) with the default seed.
logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.logNormalJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, double, double, long, int, int, long) with the default number of partitions and the default seed.
logNormalRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input mean and standard deviation
logNormalVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a log normal distribution.
logpdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian: Returns the log-density of this multivariate Gaussian at given point, x
logPerplexity(DataFrame) - Method in class org.apache.spark.ml.clustering.LDAModel: Calculate an upper bound on perplexity.
logPerplexity(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Calculate an upper bound on perplexity.
logPerplexity(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Java-friendly version of logPerplexity
logPrior() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel: Log probability of the current parameter estimate: log P(topics, topic distributions for docs | Dirichlet hyperparameters)
logPrior() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Log probability of the current parameter estimate: log P(topics, topic distributions for docs | alpha, eta)
logTrace(Function0<String>) - Method in interface org.apache.spark.Logging
logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
logUrlMap() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
logWarning(Function0<String>) - Method in interface org.apache.spark.Logging
logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
LONG() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable long type.
LongDecimal() - Static method in class org.apache.spark.sql.types.DecimalType
LongParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Long] for Java.
LongParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.LongParam
LongParam(String, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
LongParam(Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.LongParam
LongParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
longRddToDataFrameHolder(RDD<Object>) - Method in class org.apache.spark.sql.SQLImplicits: Creates a single column DataFrame from an RDD[Long].
longToLongWritable(long) - Static method in class org.apache.spark.SparkContext
LongType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the LongType object.
LongType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing Long values.
longWritableConverter() - Static method in class org.apache.spark.SparkContext
lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the list of values in the RDD for key key.
lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return the list of values in the RDD for key key.
lookupTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
loss() - Method in class org.apache.spark.ml.classification.LogisticAggregator
loss() - Method in class org.apache.spark.ml.regression.AFTAggregator
loss() - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
loss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
Loss - Interface in org.apache.spark.mllib.tree.loss: :: DeveloperApi :: Trait for adding "pluggable" loss functions for the gradient boosting algorithm.
Losses - Class in org.apache.spark.mllib.tree.loss
Losses() - Constructor for class org.apache.spark.mllib.tree.loss.Losses
lossType() - Method in class org.apache.spark.ml.classification.GBTClassifier: Loss function which GBT tries to minimize.
lossType() - Method in class org.apache.spark.ml.regression.GBTRegressor: Loss function which GBT tries to minimize.
low() - Method in class org.apache.spark.partial.BoundedDouble
lower(Column) - Static method in class org.apache.spark.sql.functions: Converts a string column to lower case.
lpad(Column, int, String) - Static method in class org.apache.spark.sql.functions: Left-pad the string column with the given pad string to the given length.
lt(double) - Static method in class org.apache.spark.ml.param.ParamValidators: Check if value < upperBound
lt(Object) - Method in class org.apache.spark.sql.Column: Less than.
ltEq(double) - Static method in class org.apache.spark.ml.param.ParamValidators: Check if value <= upperBound
ltrim(Column) - Static method in class org.apache.spark.sql.functions: Trim the spaces from left end for the specified string value.
LZ4CompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: LZ4 implementation of CompressionCodec.
LZ4CompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZ4CompressionCodec
LZFCompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: LZF implementation of CompressionCodec.
LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec
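
The lt(double) and ltEq(double) entries above differ only in whether the upper bound itself is accepted. A minimal plain-Java sketch of that distinction follows; the class name BoundValidators and the use of java.util.function.DoublePredicate are assumptions made for illustration and are not Spark's ParamValidators API, which returns Scala functions.

```java
import java.util.function.DoublePredicate;

// Hypothetical helper, not part of Spark: mirrors the documented checks
// "value < upperBound" and "value <= upperBound" with plain Java predicates.
final class BoundValidators {

    // Accepts values strictly below upperBound.
    static DoublePredicate lt(double upperBound) {
        return value -> value < upperBound;
    }

    // Accepts values below or equal to upperBound.
    static DoublePredicate ltEq(double upperBound) {
        return value -> value <= upperBound;
    }

    public static void main(String[] args) {
        System.out.println(lt(1.0).test(0.5));    // true
        System.out.println(lt(1.0).test(1.0));    // false: the bound is excluded
        System.out.println(ltEq(1.0).test(1.0));  // true: the bound is included
    }
}
```

The only behavioral difference is at the boundary value, which is the case worth testing when choosing a validator for a parameter.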

M

main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
makeDriverRef(String, SparkConf, org.apache.spark.rpc.RpcEnv) - Static method in class org.apache.spark.util.RpcUtils: Retrieve an RpcEndpointRef located in the driver by its name.
makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD.
makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD, with one or more location preferences (hostnames of Spark nodes) for each object.
map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
map(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix: Map the values of this matrix using a function.
map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult: Transform this PartialResult into a PartialResult of type T.
map(Function1<T, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to all elements of this RDD.
map(DataType, DataType) - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type map.
map(MapType) - Method in class org.apache.spark.sql.ColumnName
map(Function1<Row, R>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame: Returns a new RDD by applying a function to all rows of this DataFrame.
map(Function1<T, U>, Encoder) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Returns a new Dataset that contains the result of applying func to each element.
map(MapFunction<T, U>, Encoder) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Returns a new Dataset that contains the result of applying func to each element.
map() - Method in class org.apache.spark.sql.types.Metadata
map(Function<T, R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream.
map(Function1<T, U>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by applying a function to all elements of this DStream.
mapEdgePartitions(Function2<Object, EdgePartition<ED, VD>, EdgePartition<ED2, VD2>>, ClassTag<ED2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
mapEdges(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Transforms each edge attribute in the graph using the map function.
mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Transforms each edge attribute using the map function, passing it a whole partition at a time.
mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
MapFunction<T,U> - Interface in org.apache.spark.api.java.function: Base interface for a map function used in Dataset's map function.
mapGroups(Function2<K, Iterator<V>, U>, Encoder) - Method in class org.apache.spark.sql.GroupedDataset: Applies the given function to each group of data.
mapGroups(MapGroupsFunction<K, V, U>, Encoder) - Method in class org.apache.spark.sql.GroupedDataset: Applies the given function to each group of data.
MapGroupsFunction<K,V,R> - Interface in org.apache.spark.api.java.function: Base interface for a map function used in GroupedDataset's mapGroup function.
mapId() - Method in class org.apache.spark.FetchFailed
mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
mapId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
mapId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
mapOutputTracker() - Method in class org.apache.spark.SparkEnv
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(Function1<Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD.
mapPartitions(Function1<Iterator<Row>, Iterator<R>>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame: Returns a new RDD by applying a function to each partition of this DataFrame.
mapPartitions(Function1<Iterator<T>, Iterator>, Encoder) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Returns a new Dataset that contains the result of applying func to each partition.
mapPartitions(MapPartitionsFunction<T, U>, Encoder) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Returns a new Dataset that contains the result of applying func to each partition.
mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitions(Function1<Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
MapPartitionsFunction<T,U> - Interface in org.apache.spark.api.java.function: Base interface for function used in Dataset's mapPartitions.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD of this DStream.
mapPartitionsWithContext(Function2<TaskContext, Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: :: DeveloperApi :: Return a new RDD by applying a function to each partition of this RDD.
mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaHadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaNewHadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.HadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.NewHadoopRDD: Maps over a partition, providing the InputSplit that was used as the base of the partition.
mapPartitionsWithSplit(Function2<Object, Iterator<T>, Iterator>, boolean, ClassTag) - Method in class org.apache.spark.rdd.RDD: Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph: Aggregates values from the neighboring edges and vertices of each vertex.
mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
mapSideCombine() - Method in class org.apache.spark.ShuffleDependency
mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return a new RDD by applying a function to all elements of this RDD.
mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream by applying a function to all elements of this DStream.
mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Transforms each edge attribute using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Transforms each edge attribute using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Transforms each edge attribute a partition at a time using the map function, passing it the adjacent vertex attributes as well.
mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
MapType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type for Maps.
MapType(DataType, DataType, boolean) - Constructor for class org.apache.spark.sql.types.MapType
MapType() - Constructor for class org.apache.spark.sql.types.MapType: No-arg constructor for kryo.
mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD: Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.EdgeRDD: Map the values in an edge partitioning preserving the structure but changing the values.
mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD: Maps each vertex attribute, preserving the index.
mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD: Maps each vertex attribute, additionally supplying the vertex ID.
mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapValues(Function1<V, U>, ClassTag) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying a map function to the value of each key-value pair in 'this' DStream without changing the key.
mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph: Transforms each vertex attribute in the graph using the map function.
mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
mapWith(Function1<Object, A>, boolean, Function2<T, A, U>, ClassTag) - Method in class org.apache.spark.rdd.RDD: Maps f over this RDD, where f takes an additional parameter of type A.
mapWithState(StateSpec<K, V, StateType, MappedType>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: :: Experimental :: Return a JavaMapWithStateDStream by applying a function to every key-value element of this stream, while maintaining some state data for each unique key.
mapWithState(StateSpec<K, V, StateType, MappedType>, ClassTag<StateType>, ClassTag<MappedType>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: :: Experimental :: Return a MapWithStateDStream by applying a function to every key-value element of this stream, while maintaining some state data for each unique key.
MapWithStateDStream<KeyType,ValueType,StateType,MappedType> - Class in org.apache.spark.streaming.dstream: :: Experimental :: DStream representing the stream of data generated by mapWithState operation on a pair DStream.
MapWithStateDStream(StreamingContext, ClassTag<MappedType>) - Constructor for class org.apache.spark.streaming.dstream.MapWithStateDStream
mark(int) - Method in class org.apache.spark.storage.BufferReleasingInputStream
markSupported() - Method in class org.apache.spark.storage.BufferReleasingInputStream
mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph: Restricts the graph to only the vertices and edges that are also in other, but keeps the attributes from this graph.
mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
master() - Method in class org.apache.spark.api.java.JavaSparkContext
master() - Method in class org.apache.spark.SparkContext
Matrices - Class in org.apache.spark.mllib.linalg: Factory methods for Matrix.
Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
Matrix - Interface in org.apache.spark.mllib.linalg: Trait for a local matrix.
MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed: Represents an entry in a distributed matrix.
MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation: Model representing the result of matrix factorization.
MatrixFactorizationModel(int, RDD<Tuple2<Object, double[]>>, RDD<Tuple2<Object, double[]>>) - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
max() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Returns the maximum element from this RDD as defined by the default comparator (natural order).
max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the maximum element from this RDD as defined by the specified Comparator[T].
max() - Method in class org.apache.spark.ml.attribute.NumericAttribute
max() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Maximum value of each dimension.
max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Maximum value of each column.
max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the max of this RDD as defined by the implicit Ordering[T].
max(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the maximum value of the expression in a group.
max(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the maximum value of the column in a group.
max(String...) - Method in class org.apache.spark.sql.GroupedData: Compute the max value for each numeric column for each group.
max(Seq<String>) - Method in class org.apache.spark.sql.GroupedData: Compute the max value for each numeric column for each group.
max(Duration) - Method in class org.apache.spark.streaming.Duration
max(Time) - Method in class org.apache.spark.streaming.Time
max() - Method in class org.apache.spark.util.StatCounter
MAX_HASH_NNZ() - Static method in class org.apache.spark.mllib.linalg.Vectors: Max number of nonzero entries used in computing hash code.
MAX_LONG_DIGITS() - Static method in class org.apache.spark.sql.types.Decimal: Maximum number of decimal digits a Long can represent
MAX_PRECISION() - Static method in class org.apache.spark.sql.types.DecimalType
MAX_SCALE() - Static method in class org.apache.spark.sql.types.DecimalType
maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
maxBufferSizeMb() - Method in class org.apache.spark.serializer.KryoSerializer
maxCores() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
maxIters() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
maxMem() - Method in class org.apache.spark.storage.StorageStatus
maxMemory() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
maxNodesInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return the maximum number of nodes which can be in the given level of the tree.
maxVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
md5(Column) - Static method in class org.apache.spark.sql.functions: Calculates the MD5 digest of a binary column and returns the value as a 32 character hex string.
mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the mean of this RDD's elements.
mean() - Method in class org.apache.spark.ml.feature.StandardScalerModel
mean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
mean() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
mean() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
mean() - Method in class org.apache.spark.mllib.random.PoissonGenerator
mean() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Sample mean of each dimension.
mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample mean vector.
mean() - Method in class org.apache.spark.partial.BoundedDouble
mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the mean of this RDD's elements.
mean(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the average of the values in a group.
mean(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the average of the values in a group.
mean(String...) - Method in class org.apache.spark.sql.GroupedData: Compute the average value for each numeric column for each group.
mean(Seq<String>) - Method in class org.apache.spark.sql.GroupedData: Compute the average value for each numeric column for each group.
mean() - Method in class org.apache.spark.util.StatCounter
meanAbsoluteError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Returns the mean absolute error, which is a risk function corresponding to the expected value of the absolute error loss or l1-norm loss.
meanAbsoluteError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics: Returns the mean absolute error, which is a risk function corresponding to the expected value of the absolute error loss or l1-norm loss.
meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the approximate mean of the elements in this RDD.
meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Approximate operation to return the mean within a timeout.
meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Approximate operation to return the mean within a timeout.
meanAveragePrecision() - Method in class org.apache.spark.mllib.evaluation.RankingMetrics: Returns the mean average precision (MAP) of all the queries.
means() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
meanSquaredError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Returns the mean squared error, which is a risk function corresponding to the expected value of the squared error loss or quadratic loss.
meanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics: Returns the mean squared error, which is a risk function corresponding to the expected value of the squared error loss or quadratic loss.
MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.StageData
memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetrics
MemoryEntry - Class in org.apache.spark.storage
MemoryEntry(Object, long, boolean) - Constructor for class org.apache.spark.storage.MemoryEntry
memoryManager() - Method in class org.apache.spark.SparkEnv
memoryPerExecutorMB() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
memoryRemaining() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
memoryUsed() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
memRemaining() - Method in class org.apache.spark.storage.StorageStatus: Return the memory remaining in this block manager.
memSize() - Method in class org.apache.spark.storage.BlockStatus
memSize() - Method in class org.apache.spark.storage.BlockUpdatedInfo
memSize() - Method in class org.apache.spark.storage.RDDInfo
memUsed() - Method in class org.apache.spark.storage.StorageStatus: Return the memory used by this block manager.
memUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus: Return the memory used by the given RDD in this block manager in O(1) time.
merge(R) - Method in class org.apache.spark.Accumulable: Merge two accumulable objects together
merge(LogisticAggregator) - Method in class org.apache.spark.ml.classification.LogisticAggregator: Merge another LogisticAggregator, and update the loss and gradient of the objective function.
merge(AFTAggregator) - Method in class org.apache.spark.ml.regression.AFTAggregator
merge(LeastSquaresAggregator) - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator: Merge another LeastSquaresAggregator, and update the loss and gradient of the objective function.
merge(IDF.DocumentFrequencyAggregator) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator: Merges another.
merge(MultivariateOnlineSummarizer) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Merge another MultivariateOnlineSummarizer, and update the statistical summary.
merge(B, B) - Method in class org.apache.spark.sql.expressions.Aggregator: Merge two intermediate values.
merge(MutableAggregationBuffer, Row) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Merges two aggregation buffers and stores the updated buffer values back to buffer1.
merge(double) - Method in class org.apache.spark.util.StatCounter: Add a value into this StatCounter, updating the internal statistics.
merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter: Add multiple values into this StatCounter, updating the internal statistics.
merge(StatCounter) - Method in class org.apache.spark.util.StatCounter: Merge another StatCounter into this one, adding up the internal statistics.
mergeCombiners() - Method in class org.apache.spark.Aggregator
mergeValue() - Method in class org.apache.spark.Aggregator
MESOS_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
message() - Method in class org.apache.spark.FetchFailed
message() - Method in exception org.apache.spark.sql.AnalysisException
Metadata - Class in org.apache.spark.sql.types: :: DeveloperApi ::
Metadata() - Constructor for class org.apache.spark.sql.types.Metadata: No-arg constructor for kryo.
metadata() - Method in class org.apache.spark.sql.types.StructField
metadata() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
METADATA_KEY_DESCRIPTION() - Static method in class org.apache.spark.streaming.scheduler.StreamInputInfo: The key for description in StreamInputInfo.metadata.
MetadataBuilder - Class in org.apache.spark.sql.types: :: DeveloperApi ::
MetadataBuilder() - Constructor for class org.apache.spark.sql.types.MetadataBuilder
metadataDescription() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
metadataHive() - Method in class org.apache.spark.sql.hive.HiveContext: The copy of the Hive client that is used to retrieve metadata from the Hive MetaStore.
method() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
MethodIdentifier<T> - Class in org.apache.spark.util: Helper class to identify a method.
MethodIdentifier(Class<T>, String, String) - Constructor for class org.apache.spark.util.MethodIdentifier
metricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator: param for metric name in evaluation. Default: areaUnderROC
metricName() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator: param for metric name in evaluation (supports "f1" (default), "precision", "recall", "weightedPrecision", "weightedRecall")
metricName() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator: param for metric name in evaluation (supports "rmse" (default), "mse", "r2", and "mae")
metrics() - Method in class org.apache.spark.ExceptionFailure
metricsSystem() - Method in class org.apache.spark.SparkContext
metricsSystem() - Method in class org.apache.spark.SparkEnv
MFDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate RDD(s) containing data for Matrix Factorization.
MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
microF1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns micro-averaged label-based f1-measure (equal to micro-averaged document-based f1-measure)
microPrecision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns micro-averaged label-based precision (equal to micro-averaged document-based precision)
microRecall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns micro-averaged label-based recall (equal to micro-averaged document-based recall)
milliseconds() - Method in class org.apache.spark.streaming.Duration
milliseconds(long) - Static method in class org.apache.spark.streaming.Durations
Milliseconds - Class in org.apache.spark.streaming: Helper object that creates an instance of Duration representing a given number of milliseconds.
Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
milliseconds() - Method in class org.apache.spark.streaming.Time
millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener: Reformat a time interval in milliseconds to a prettier format for output
min() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Returns the minimum element from this RDD as defined by the default comparator (natural order).
min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the minimum element from this RDD as defined by the specified Comparator[T].
min() - Method in class org.apache.spark.ml.attribute.NumericAttribute
min() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Minimum value of each dimension.
min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Minimum value of each column.
min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the min of this RDD as defined by the implicit Ordering[T].
min(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the minimum value of the expression in a group.
min(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the minimum value of the column in a group.
min(String...) - Method in class org.apache.spark.sql.GroupedData: Compute the min value for each numeric column for each group.
min(Seq<String>) - Method in class org.apache.spark.sql.GroupedData: Compute the min value for each numeric column for each group.
min(Duration) - Method in class org.apache.spark.streaming.Duration
min(Time) - Method in class org.apache.spark.streaming.Time
min() - Method in class org.apache.spark.util.StatCounter
minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF
minInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
MinMaxScaler - Class in org.apache.spark.ml.feature: :: Experimental :: Rescale each feature individually to a common range [min, max] linearly using column summary statistics, which is also known as min-max normalization or Rescaling.
MinMaxScaler(String) - Constructor for class org.apache.spark.ml.feature.MinMaxScaler
MinMaxScaler() - Constructor for class org.apache.spark.ml.feature.MinMaxScaler
MinMaxScalerModel - Class in org.apache.spark.ml.feature
minTokenLength() - Method in class org.apache.spark.ml.feature.RegexTokenizer: Minimum token length, >= 0.
minus(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
minus(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
minus(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.VertexRDD: For each VertexId present in both this and other, minus will act as a set difference operation returning only those unique VertexIds present in this.
minus(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD: For each VertexId present in both this and other, minus will act as a set difference operation returning only those unique VertexIds present in this.
minus(Object) - Method in class org.apache.spark.sql.Column: Subtraction.
minus(Duration) - Method in class org.apache.spark.streaming.Duration
minus(Time) - Method in class org.apache.spark.streaming.Time
minus(Duration) - Method in class org.apache.spark.streaming.Time
minute(Column) - Static method in class org.apache.spark.sql.functions: Extracts the minutes as an integer from a given date/timestamp/string.
minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
minutes(long) - Static method in class org.apache.spark.streaming.Durations
Minutes - Class in org.apache.spark.streaming: Helper object that creates an instance of Duration representing a given number of minutes.
Minutes() - Constructor for class org.apache.spark.streaming.Minutes
minVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
mkString() - Method in interface org.apache.spark.sql.Row: Displays all elements of this sequence in a string (without a separator).
mkString(String) - Method in interface org.apache.spark.sql.Row: Displays all elements of this sequence in a string using a separator string.
mkString(String, String, String) - Method in interface org.apache.spark.sql.Row: Displays all elements of this traversable or iterator in a string using start, end, and separator strings.
MLPairRDDFunctions<K,V> - Class in org.apache.spark.mllib.rdd: Machine learning specific Pair RDD functions.
MLPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.mllib.rdd.MLPairRDDFunctions
MLReadable<T> - Interface in org.apache.spark.ml.util: Trait for objects that provide MLReader.
MLReader<T> - Class in org.apache.spark.ml.util: Abstract class for utility classes that can load ML instances.
MLReader() - Constructor for class org.apache.spark.ml.util.MLReader
MLUtils - Class in org.apache.spark.mllib.util: Helper methods to load, save and pre-process data used in MLlib.
MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
MLWritable - Interface in org.apache.spark.ml.util: Trait for classes that provide MLWriter.
MLWriter - Class in org.apache.spark.ml.util: Abstract class for utility classes that can save ML instances.
MLWriter() - Constructor for class org.apache.spark.ml.util.MLWriter
mod(Object) - Method in class org.apache.spark.sql.Column: Modulo (a.k.a. remainder) expression.
mode(SaveMode) - Method in class org.apache.spark.sql.DataFrameWriter: Specifies the behavior when data or table already exists.
mode(String) - Method in class org.apache.spark.sql.DataFrameWriter: Specifies the behavior when data or table already exists.
Model<M extends Model<M>> - Class in org.apache.spark.ml: :: DeveloperApi :: A fitted model, i.e., a Transformer produced by an Estimator.
Model() - Constructor for class org.apache.spark.ml.Model
model() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
model() - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
model() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
model() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: The model to be updated and used for prediction.
model() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
models() - Method in class org.apache.spark.ml.classification.OneVsRestModel
modelType() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
modificationTime() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.Rating$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.DoubleAccumulatorParam$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.FloatAccumulatorParam$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.IntAccumulatorParam$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.SparkContext.LongAccumulatorParam$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus$: Static reference to the singleton instance of this Scala object.
MODULE$ - Static variable in class org.apache.spark.util.Vector.VectorAccumParam$: Static reference to the singleton instance of this Scala object.
monotonically_increasing_id() - Static method in class org.apache.spark.sql.functions: A column expression that generates monotonically increasing 64-bit integers.
monotonicallyIncreasingId() - Static method in class org.apache.spark.sql.functions: A column expression that generates monotonically increasing 64-bit integers.
month(Column) - Static method in class org.apache.spark.sql.functions: Extracts the month as an integer from a given date/timestamp/string.
months_between(Column, Column) - Static method in class org.apache.spark.sql.functions
MQTTUtils - Class in org.apache.spark.streaming.mqtt
MQTTUtils() - Constructor for class org.apache.spark.streaming.mqtt.MQTTUtils
MsSqlServerDialect - Class in org.apache.spark.sql.jdbc
MsSqlServerDialect() - Constructor for class org.apache.spark.sql.jdbc.MsSqlServerDialect
mu() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
MulticlassClassificationEvaluator - Class in org.apache.spark.ml.evaluation: :: Experimental :: Evaluator for multiclass classification, which expects two input columns: score and label.
MulticlassClassificationEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
MulticlassClassificationEvaluator() - Constructor for class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
MulticlassMetrics - Class in org.apache.spark.mllib.evaluation: :: Experimental :: Evaluator for multiclass classification.
MulticlassMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.MulticlassMetrics
MultilabelMetrics - Class in org.apache.spark.mllib.evaluation: Evaluator for multilabel classification.
MultilabelMetrics(RDD<Tuple2<double[], double[]>>) - Constructor for class org.apache.spark.mllib.evaluation.MultilabelMetrics
multiLabelValidator(int) - Static method in class org.apache.spark.mllib.util.DataValidators: Function to check if labels used for k class multi-label classification are in the range of {0, 1, ..., k - 1}.
MultilayerPerceptronClassificationModel - Class in org.apache.spark.ml.classification: :: Experimental :: Classification model based on the Multilayer Perceptron.
MultilayerPerceptronClassifier - Class in org.apache.spark.ml.classification: :: Experimental :: Classifier trainer based on the Multilayer Perceptron.
MultilayerPerceptronClassifier(String) - Constructor for class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
MultilayerPerceptronClassifier() - Constructor for class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
Multinomial() - Static method in class org.apache.spark.mllib.classification.NaiveBayes
multiply(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Left multiplies this BlockMatrix to other, another BlockMatrix.
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Multiply this matrix by a local matrix on the right.
multiply(DenseMatrix) - Method in interface org.apache.spark.mllib.linalg.Matrix: Convenience method for `Matrix`-`DenseMatrix` multiplication.
multiply(DenseVector) - Method in interface org.apache.spark.mllib.linalg.Matrix: Convenience method for `Matrix`-`DenseVector` multiplication.
multiply(Vector) - Method in interface org.apache.spark.mllib.linalg.Matrix: Convenience method for `Matrix`-`Vector` multiplication.
multiply(Object) - Method in class org.apache.spark.sql.Column: Multiplication of this expression and another expression.
multiply(double) - Method in class org.apache.spark.util.Vector
MultivariateGaussian - Class in org.apache.spark.mllib.stat.distribution: :: DeveloperApi :: This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution.
MultivariateGaussian(Vector, Matrix) - Constructor for class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
MultivariateOnlineSummarizer - Class in org.apache.spark.mllib.stat: :: DeveloperApi :: MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean, variance, minimum, maximum, counts, and nonzero counts for instances in sparse or dense vector format in an online fashion.
MultivariateOnlineSummarizer() - Constructor for class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat: Trait for multivariate statistical summary of a data matrix.
mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.DStream
MutableAggregationBuffer - Class in org.apache.spark.sql.expressions: :: Experimental :: A Row representing a mutable aggregation buffer.
MutableAggregationBuffer() - Constructor for class org.apache.spark.sql.expressions.MutableAggregationBuffer
MutablePair<T1,T2> - Class in org.apache.spark.util: :: DeveloperApi :: A tuple of 2 elements.
MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
MutablePair() - Constructor for class org.apache.spark.util.MutablePair: No-arg constructor for serialization
myName() - Method in class org.apache.spark.util.InnerClosureFinder
MySQLDialect - Class in org.apache.spark.sql.jdbc
MySQLDialect() - Constructor for class org.apache.spark.sql.jdbc.MySQLDialect

N

n() - Method in class org.apache.spark.ml.feature.NGram: Minimum n-gram length, >= 1.
na() - Method in class org.apache.spark.sql.DataFrame: Returns a DataFrameNaFunctions for working with missing data.
NaiveBayes - Class in org.apache.spark.ml.classification: :: Experimental :: Naive Bayes Classifiers.
NaiveBayes(String) - Constructor for class org.apache.spark.ml.classification.NaiveBayes
NaiveBayes() - Constructor for class org.apache.spark.ml.classification.NaiveBayes
NaiveBayes - Class in org.apache.spark.mllib.classification
NaiveBayes(double) - Constructor for class org.apache.spark.mllib.classification.NaiveBayes
NaiveBayes() - Constructor for class org.apache.spark.mllib.classification.NaiveBayes
NaiveBayesModel - Class in org.apache.spark.ml.classification: :: Experimental :: Model produced by NaiveBayes. Param pi: log of class priors, whose dimension is C (number of classes). Param theta: log of class conditional probabilities, whose dimension is C (number of classes) by D (number of features).
NaiveBayesModel - Class in org.apache.spark.mllib.classification: Model for Naive Bayes Classifiers.
name() - Method in class org.apache.spark.Accumulable
name() - Method in interface org.apache.spark.api.java.JavaRDDLike
name() - Method in class org.apache.spark.ml.attribute.Attribute: Name of the attribute.
name() - Method in class org.apache.spark.ml.attribute.AttributeGroup
name() - Method in class org.apache.spark.ml.attribute.AttributeType
name() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
name() - Method in class org.apache.spark.ml.attribute.NominalAttribute
name() - Method in class org.apache.spark.ml.attribute.NumericAttribute
name() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
name() - Method in class org.apache.spark.ml.param.Param
name() - Method in class org.apache.spark.rdd.RDD: A friendly name for this RDD
name() - Method in class org.apache.spark.scheduler.AccumulableInfo
name() - Method in class org.apache.spark.scheduler.StageInfo
name() - Method in interface org.apache.spark.SparkStageInfo
name() - Method in class org.apache.spark.SparkStageInfoImpl
name() - Method in class org.apache.spark.sql.types.StructField
name() - Method in class org.apache.spark.status.api.v1.AccumulableInfo
name() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
name() - Method in class org.apache.spark.status.api.v1.JobData
name() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
name() - Method in class org.apache.spark.status.api.v1.StageData
name() - Method in class org.apache.spark.storage.BlockId: A globally unique identifier for this Block.
name() - Method in class org.apache.spark.storage.BroadcastBlockId
name() - Method in class org.apache.spark.storage.RDDBlockId
name() - Method in class org.apache.spark.storage.RDDInfo
name() - Method in class org.apache.spark.storage.ShuffleBlockId
name() - Method in class org.apache.spark.storage.ShuffleDataBlockId
name() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
name() - Method in class org.apache.spark.storage.StreamBlockId
name() - Method in class org.apache.spark.storage.TaskResultBlockId
name() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
name() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
name() - Method in class org.apache.spark.util.MethodIdentifier
names() - Method in class org.apache.spark.ml.feature.VectorSlicer: An array of feature names to select features from a vector column.
nanvl(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns col1 if it is not NaN, or col2 if col1 is NaN.
NarrowDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Base class for dependencies where each partition of the child RDD depends on a small number of partitions of the parent RDD.
NarrowDependency(RDD<T>) - Constructor for class org.apache.spark.NarrowDependency
ndcgAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics: Compute the average NDCG value of all the queries, truncated at ranking position k.
needConversion() - Method in class org.apache.spark.sql.sources.BaseRelation: Whether the objects in Row need to be converted to the internal representation, for example: java.lang.String -> UTF8String, java.lang.Decimal -> Decimal.
negate(Column) - Static method in class org.apache.spark.sql.functions: Unary minus, i.e. negate the expression.
networkStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Deprecated. As of 1.0.0, replaced by receiverStream.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopFile(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop file with an arbitrary new API InputFormat.
newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) - Method in class org.apache.spark.SparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newAPIHadoopRDD(Configuration, Class<F>, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext: Get an RDD for a given Hadoop file with an arbitrary new API InputFormat and extra configuration options to pass to the input format.
newBooleanEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newBroadcast(T, boolean, long, ClassTag<T>) - Method in interface org.apache.spark.broadcast.BroadcastFactory: Creates a new broadcast variable.
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
newBroadcast(T, boolean, long, ClassTag<T>) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
newByteEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newDoubleEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newFloatEncoder() - Method in class org.apache.spark.sql.SQLImplicits
NewHadoopRDD<K,V> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the new MapReduce API (org.apache.hadoop.mapreduce).
NewHadoopRDD(SparkContext, Class<? extends InputFormat<K, V>>, Class<K>, Class<V>, Configuration) - Constructor for class org.apache.spark.rdd.NewHadoopRDD
newInstance() - Method in class org.apache.spark.serializer.JavaSerializer
newInstance() - Method in class org.apache.spark.serializer.KryoSerializer
newInstance() - Method in class org.apache.spark.serializer.Serializer: Creates a new SerializerInstance.
newInstance(String, StructType, TaskAttemptContext) - Method in class org.apache.spark.sql.sources.OutputWriterFactory: When writing to a HadoopFsRelation, this method gets called by each task on executor side to instantiate new OutputWriters.
newIntEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newKryo() - Method in class org.apache.spark.serializer.KryoSerializer
newKryoOutput() - Method in class org.apache.spark.serializer.KryoSerializer
newLongEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newProductEncoder(TypeTags.TypeTag<T>) - Method in class org.apache.spark.sql.SQLImplicits
newSession() - Method in class org.apache.spark.sql.hive.HiveContext: Returns a new HiveContext as a new session, which has its own SQLConf, UDF/UDAF, temporary tables and SessionState, but shares the same CacheManager, IsolatedClientLoader and Hive client (both execution and metadata) with the existing HiveContext.
newSession() - Method in class org.apache.spark.sql.SQLContext: Returns a SQLContext as a new session, with its own SQL configurations, temporary tables and registered functions, but sharing the same SparkContext, CacheManager, SQLListener and SQLTab.
newShortEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newStringEncoder() - Method in class org.apache.spark.sql.SQLImplicits
newTemporaryConfiguration() - Static method in class org.apache.spark.sql.hive.HiveContext: Constructs a configuration for hive, where the metastore is located in a temp directory.
next() - Method in class org.apache.spark.InterruptibleIterator
next() - Method in interface org.apache.spark.mllib.clustering.LDAOptimizer
next() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
next_day(Column, String) - Static method in class org.apache.spark.sql.functions: Given a date column, returns the first date which is later than the value of the date column that is on the specified day of the week.
nextValue() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
nextValue() - Method in class org.apache.spark.mllib.random.GammaGenerator
nextValue() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
nextValue() - Method in class org.apache.spark.mllib.random.PoissonGenerator
nextValue() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator: Returns an i.i.d. sample as a generic type from an underlying distribution.
nextValue() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
nextValue() - Method in class org.apache.spark.mllib.random.UniformGenerator
nextValue() - Method in class org.apache.spark.mllib.random.WeibullGenerator
NGram - Class in org.apache.spark.ml.feature: :: Experimental :: A feature transformer that converts the input array of strings into an array of n-grams.
NGram(String) - Constructor for class org.apache.spark.ml.feature.NGram
NGram() - Constructor for class org.apache.spark.ml.feature.NGram
NO_PREF() - Static method in class org.apache.spark.scheduler.TaskLocality
Node - Class in org.apache.spark.ml.tree: :: DeveloperApi :: Decision tree node interface.
Node() - Constructor for class org.apache.spark.ml.tree.Node
Node - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Node in a decision tree.
Node(int, Predict, double, boolean, Option<Split>, Option<Node>, Option<Node>, Option<InformationGainStats>) - Constructor for class org.apache.spark.mllib.tree.model.Node
NODE_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
noLocality() - Method in class org.apache.spark.rdd.PartitionCoalescer
Nominal() - Static method in class org.apache.spark.ml.attribute.AttributeType: Nominal type.
NominalAttribute - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: A nominal attribute.
NONE - Static variable in class org.apache.spark.api.java.StorageLevels
None - Static variable in class org.apache.spark.graphx.TripletFields: None of the triplet fields are exposed.
NONE() - Static method in class org.apache.spark.scheduler.SchedulingMode
NONE() - Static method in class org.apache.spark.storage.StorageLevel
NoopDialect - Class in org.apache.spark.sql.jdbc: NOOP dialect object, always returning the neutral element.
NoopDialect() - Constructor for class org.apache.spark.sql.jdbc.NoopDialect
norm(Vector, double) - Static method in class org.apache.spark.mllib.linalg.Vectors: Returns the p-norm of this vector.
Normalizer - Class in org.apache.spark.ml.feature: :: Experimental :: Normalize a vector to have unit norm using the given p-norm.
Normalizer(String) - Constructor for class org.apache.spark.ml.feature.Normalizer
Normalizer() - Constructor for class org.apache.spark.ml.feature.Normalizer
Normalizer - Class in org.apache.spark.mllib.feature: Normalizes samples individually to unit L^p norm
Normalizer(double) - Constructor for class org.apache.spark.mllib.feature.Normalizer
Normalizer() - Constructor for class org.apache.spark.mllib.feature.Normalizer
normalizeToProbabilitiesInPlace(DenseVector) - Static method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel: Normalize a vector of raw predictions to be a multinomial probability vector, in place.
normalJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.normalRDD(org.apache.spark.SparkContext, long, int, long).
normalJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default seed.
normalJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default number of partitions and the default seed.
normalJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.normalVectorRDD(org.apache.spark.SparkContext, long, int, int, long).
normalJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, long, int, int, long) with the default seed.
normalJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.normalJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, long, int, int, long) with the default number of partitions and the default seed.
normalRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the standard normal distribution.
normalVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the standard normal distribution.
normL1() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: L1 norm of each dimension.
normL1() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: L1 norm of each column
normL2() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: L2 (Euclidean) norm of each dimension.
normL2() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Euclidean magnitude of each column
normPdf(double, double, double, double) - Static method in class org.apache.spark.mllib.stat.KernelDensity: Evaluates the PDF of a normal distribution.
not(Column) - Static method in class org.apache.spark.sql.functions: Inversion of boolean expression, i.e. NOT.
Not - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff child is evaluated to false.
Not(Filter) - Constructor for class org.apache.spark.sql.sources.Not
notEqual(Object) - Method in class org.apache.spark.sql.Column: Inequality test.
ntile(int) - Static method in class org.apache.spark.sql.functions: Window function: returns the ntile group id (from 1 to n inclusive) in an ordered window partition.
nullable() - Method in class org.apache.spark.sql.types.StructField
nullHypothesis() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
nullHypothesis() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
nullHypothesis() - Method in interface org.apache.spark.mllib.stat.test.TestResult: Null hypothesis of the test.
NullType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the NullType object.
NullType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing NULL values.
numActives() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numActives() - Method in class org.apache.spark.mllib.linalg.DenseVector
numActives() - Method in interface org.apache.spark.mllib.linalg.Matrix: Find the number of values stored explicitly.
numActives() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
numActives() - Method in class org.apache.spark.mllib.linalg.SparseVector
numActives() - Method in interface org.apache.spark.mllib.linalg.Vector: Number of active entries.
numActiveStages() - Method in class org.apache.spark.status.api.v1.JobData
numActiveTasks() - Method in interface org.apache.spark.SparkStageInfo
numActiveTasks() - Method in class org.apache.spark.SparkStageInfoImpl
numActiveTasks() - Method in class org.apache.spark.status.api.v1.JobData
numActiveTasks() - Method in class org.apache.spark.status.api.v1.StageData
numAttributes() - Method in class org.apache.spark.ml.attribute.AttributeGroup
numberOfHiccups() - Method in class org.apache.spark.streaming.receiver.Statistics
numberOfMsgs() - Method in class org.apache.spark.streaming.receiver.Statistics
numberOfWorkers() - Method in class org.apache.spark.streaming.receiver.Statistics
numBins() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
numBlocks() - Method in class org.apache.spark.storage.StorageStatus: Return the number of blocks stored in this block manager in O(RDDs) time.
numCachedPartitions() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
numCachedPartitions() - Method in class org.apache.spark.storage.RDDInfo
numClasses() - Method in class org.apache.spark.ml.classification.ClassificationModel: Number of classes (values which the label can take).
numClasses() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
numClasses() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
numClasses() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
numClasses() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
numClasses() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
numClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
numColBlocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
numCols() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Gets or computes the number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
numCols() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Gets or computes the number of columns.
numCols() - Method in interface org.apache.spark.mllib.linalg.Matrix: Number of columns.
numCols() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
numCompletedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
numCompletedStages() - Method in class org.apache.spark.status.api.v1.JobData
numCompletedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
numCompletedTasks() - Method in interface org.apache.spark.SparkStageInfo
numCompletedTasks() - Method in class org.apache.spark.SparkStageInfoImpl
numCompletedTasks() - Method in class org.apache.spark.status.api.v1.JobData
numCompleteTasks() - Method in class org.apache.spark.status.api.v1.StageData
numEdges() - Method in class org.apache.spark.graphx.GraphOps: The number of edges in the graph.
Numeric() - Static method in class org.apache.spark.ml.attribute.AttributeType: Numeric type.
numeric() - Method in class org.apache.spark.sql.types.ByteType
numeric() - Method in class org.apache.spark.sql.types.DecimalType
numeric() - Method in class org.apache.spark.sql.types.DoubleType
numeric() - Method in class org.apache.spark.sql.types.FloatType
numeric() - Method in class org.apache.spark.sql.types.IntegerType
numeric() - Method in class org.apache.spark.sql.types.LongType
numeric() - Method in class org.apache.spark.sql.types.NumericType
numeric() - Method in class org.apache.spark.sql.types.ShortType
NumericAttribute - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: A numeric attribute with optional summary statistics.
numericColumns() - Method in class org.apache.spark.sql.DataFrame
numericRDDToDoubleRDDFunctions(RDD<T>, Numeric<T>) - Static method in class org.apache.spark.rdd.RDD
numericRDDToDoubleRDDFunctions(RDD<T>, Numeric<T>) - Static method in class org.apache.spark.SparkContext
NumericType - Class in org.apache.spark.sql.types: :: DeveloperApi :: Numeric data types.
NumericType() - Constructor for class org.apache.spark.sql.types.NumericType
numFailedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
numFailedStages() - Method in class org.apache.spark.status.api.v1.JobData
numFailedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
numFailedTasks() - Method in interface org.apache.spark.SparkStageInfo
numFailedTasks() - Method in class org.apache.spark.SparkStageInfoImpl
numFailedTasks() - Method in class org.apache.spark.status.api.v1.JobData
numFailedTasks() - Method in class org.apache.spark.status.api.v1.StageData
numFeatures() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
numFeatures() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
numFeatures() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
numFeatures() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
numFeatures() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
numFeatures() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
numFeatures() - Method in class org.apache.spark.ml.feature.HashingTF: Number of features.
numFeatures() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
numFeatures() - Method in class org.apache.spark.ml.PredictionModel: Returns the number of features the model was trained on.
numFeatures() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
numFeatures() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
numFeatures() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
numFeatures() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
numFeatures() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
numFeatures() - Method in class org.apache.spark.mllib.feature.HashingTF
numFeatures() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: The dimension of training features.
numInstances() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Number of instances in DataFrame predictions
numIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
numNodes() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Get number of nodes in tree, including leaf nodes.
numNonzeros() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numNonzeros() - Method in class org.apache.spark.mllib.linalg.DenseVector
numNonzeros() - Method in interface org.apache.spark.mllib.linalg.Matrix: Find the number of non-zero active values.
numNonzeros() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
numNonzeros() - Method in class org.apache.spark.mllib.linalg.SparseVector
numNonzeros() - Method in interface org.apache.spark.mllib.linalg.Vector: Number of nonzero elements.
numNonzeros() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Number of nonzero elements in each dimension.
numNonzeros() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Number of nonzero elements (including explicitly presented zero values) in each column.
numOfLinearPredictor() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: In GeneralizedLinearModel, only a single linear predictor is allowed for both weights and intercept.
numPartitions() - Method in class org.apache.spark.HashPartitioner
numPartitions() - Method in class org.apache.spark.Partitioner
numPartitions() - Method in class org.apache.spark.RangePartitioner
numPartitions() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
numPartitions() - Method in class org.apache.spark.storage.RDDInfo
numPartitions(int) - Method in class org.apache.spark.streaming.StateSpec: Set the number of partitions by which the state RDDs generated by mapWithState will be partitioned.
numRddBlocks() - Method in class org.apache.spark.storage.StorageStatus: Return the number of RDD blocks stored in this block manager in O(RDDs) time.
numRddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus: Return the number of blocks that belong to the given RDD in O(1) time.
numRecords() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: The number of records received by the receivers in this batch.
numRecords() - Method in class org.apache.spark.streaming.scheduler.StreamInputInfo
numRetries(SparkConf) - Static method in class org.apache.spark.util.RpcUtils: Returns the configured number of times to retry connecting
numRowBlocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
numRows() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Gets or computes the number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
numRows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Gets or computes the number of rows.
numRows() - Method in interface org.apache.spark.mllib.linalg.Matrix: Number of rows.
numRows() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
numSkippedStages() - Method in class org.apache.spark.status.api.v1.JobData
numSkippedTasks() - Method in class org.apache.spark.status.api.v1.JobData
numSpilledStages() - Method in class org.apache.spark.SpillListener
numTasks() - Method in class org.apache.spark.scheduler.StageInfo
numTasks() - Method in interface org.apache.spark.SparkStageInfo
numTasks() - Method in class org.apache.spark.SparkStageInfoImpl
numTasks() - Method in class org.apache.spark.status.api.v1.JobData
numTopFeatures() - Method in class org.apache.spark.mllib.feature.ChiSqSelector
numValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute
numVertices() - Method in class org.apache.spark.graphx.GraphOps: The number of vertices in the graph.
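
The numActives() and numNonzeros() entries above describe two different counts: explicitly stored values versus stored values that are not zero. The following is a minimal sketch of that distinction in plain Java. SparseVectorSketch and its fields are hypothetical names chosen for illustration; this is not Spark's SparseVector and makes no claim about how Spark implements these methods.

```java
// Hypothetical stand-in for a sparse vector; NOT org.apache.spark.mllib.linalg.SparseVector.
final class SparseVectorSketch {
    private final int size;         // logical length of the vector
    private final int[] indices;    // positions of the explicitly stored entries
    private final double[] values;  // stored values; may include explicit zeros

    SparseVectorSketch(int size, int[] indices, double[] values) {
        if (indices.length != values.length) {
            throw new IllegalArgumentException("indices and values must have the same length");
        }
        this.size = size;
        this.indices = indices.clone();
        this.values = values.clone();
    }

    // Number of explicitly stored ("active") entries, zeros included.
    int numActives() {
        return values.length;
    }

    // Number of stored entries whose value is not zero.
    int numNonzeros() {
        int count = 0;
        for (double v : values) {
            if (v != 0.0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Three stored entries, one of which is an explicit zero.
        SparseVectorSketch v = new SparseVectorSketch(
                5, new int[] {0, 2, 4}, new double[] {1.0, 0.0, 3.0});
        System.out.println(v.numActives());   // prints 3
        System.out.println(v.numNonzeros());  // prints 2
    }
}
```

In this sketch the two counts differ only when a zero is stored explicitly, which is why a sparse structure can report more active entries than nonzero ones.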

O

objectFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectFile(String, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Load an RDD saved as a SequenceFile containing serialized objects, with NullWritable keys and BytesWritable values that contain a serialized partition.
objectiveHistory() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionTrainingSummary
objectiveHistory() - Method in interface org.apache.spark.ml.classification.LogisticRegressionTrainingSummary: Objective function (scaled loss + regularization) at each iteration.
objectiveHistory() - Method in class org.apache.spark.ml.regression.LinearRegressionTrainingSummary
of(JavaRDD<Tuple2<T, T>>) - Static method in class org.apache.spark.mllib.evaluation.RankingMetrics: Creates a RankingMetrics instance (for Java users).
OFF_HEAP - Static variable in class org.apache.spark.api.java.StorageLevels
OFF_HEAP() - Static method in class org.apache.spark.storage.StorageLevel
offHeapUsed() - Method in class org.apache.spark.storage.StorageStatus: Return the off-heap space used by this block manager.
offHeapUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus: Return the off-heap space used by the given RDD in this block manager in O(1) time.
OffsetRange - Class in org.apache.spark.streaming.kafka: Represents a range of offsets from a single Kafka TopicAndPartition.
offsetRanges() - Method in interface org.apache.spark.streaming.kafka.HasOffsetRanges
oldLocalModel() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
oldLocalModel() - Method in class org.apache.spark.ml.clustering.LDAModel: Underlying spark.mllib model.
oldLocalModel() - Method in class org.apache.spark.ml.clustering.LocalLDAModel
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.JavaSparkListener
onApplicationEnd(SparkListenerApplicationEnd) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the application ends
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.SparkFirehoseListener
onApplicationEnd(SparkListenerApplicationEnd) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.JavaSparkListener
onApplicationStart(SparkListenerApplicationStart) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the application starts
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.SparkFirehoseListener
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.ui.exec.ExecutorsListener
onApplicationStart(SparkListenerApplicationStart) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onBatchCompleted(StreamingListenerBatchCompleted) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
onBatchCompleted(StreamingListenerBatchCompleted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a batch of jobs has completed.
onBatchStarted(StreamingListenerBatchStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a batch of jobs has started.
onBatchSubmitted(StreamingListenerBatchSubmitted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a batch of jobs has been submitted for processing.
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.JavaSparkListener
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a new block manager has joined
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.SparkFirehoseListener
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.storage.StorageStatusListener
onBlockManagerAdded(SparkListenerBlockManagerAdded) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.JavaSparkListener
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in interface org.apache.spark.scheduler.SparkListener: Called when an existing block manager has been removed
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.SparkFirehoseListener
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.storage.StorageStatusListener
onBlockManagerRemoved(SparkListenerBlockManagerRemoved) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onBlockUpdated(SparkListenerBlockUpdated) - Method in class org.apache.spark.JavaSparkListener
onBlockUpdated(SparkListenerBlockUpdated) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the driver receives a block update info.
onBlockUpdated(SparkListenerBlockUpdated) - Method in class org.apache.spark.SparkFirehoseListener
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in interface org.apache.spark.FutureAction: When this action is completed, either through an exception, or a value, applies the provided function.
onComplete(Function1<R, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult: Set a handler to be called when this PartialResult completes.
onComplete(Function1<Try<T>, U>, ExecutionContext) - Method in class org.apache.spark.SimpleFutureAction
ONE() - Static method in class org.apache.spark.sql.types.Decimal
OneHotEncoder - Class in org.apache.spark.ml.feature: :: Experimental :: A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index.
OneHotEncoder(String) - Constructor for class org.apache.spark.ml.feature.OneHotEncoder
OneHotEncoder() - Constructor for class org.apache.spark.ml.feature.OneHotEncoder
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.JavaSparkListener
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in interface org.apache.spark.scheduler.SparkListener: Called when environment properties have been updated
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.SparkFirehoseListener
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.env.EnvironmentListener
onEnvironmentUpdate(SparkListenerEnvironmentUpdate) - Method in class org.apache.spark.ui.jobs.JobProgressListener
ones(int, int) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a DenseMatrix consisting of ones.
ones(int, int) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a DenseMatrix consisting of ones.
ones(int) - Static method in class org.apache.spark.util.Vector
OneToOneDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Represents a one-to-one dependency between partitions of the parent and child RDDs.
OneToOneDependency(RDD<T>) - Constructor for class org.apache.spark.OneToOneDependency
onEvent(SparkListenerEvent) - Method in class org.apache.spark.SparkFirehoseListener
OneVsRest - Class in org.apache.spark.ml.classification: :: Experimental ::
OneVsRest(String) - Constructor for class org.apache.spark.ml.classification.OneVsRest
OneVsRest() - Constructor for class org.apache.spark.ml.classification.OneVsRest
OneVsRestModel - Class in org.apache.spark.ml.classification: :: Experimental :: Model produced by OneVsRest.
onExecutorAdded(SparkListenerExecutorAdded) - Method in class org.apache.spark.JavaSparkListener
onExecutorAdded(SparkListenerExecutorAdded) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the driver registers a new executor.
onExecutorAdded(SparkListenerExecutorAdded) - Method in class org.apache.spark.SparkFirehoseListener
onExecutorAdded(SparkListenerExecutorAdded) - Method in class org.apache.spark.ui.exec.ExecutorsListener
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.JavaSparkListener
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the driver receives task metrics from an executor in a heartbeat.
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.SparkFirehoseListener
onExecutorMetricsUpdate(SparkListenerExecutorMetricsUpdate) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in class org.apache.spark.JavaSparkListener
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in interface org.apache.spark.scheduler.SparkListener: Called when the driver removes an executor.
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in class org.apache.spark.SparkFirehoseListener
onExecutorRemoved(SparkListenerExecutorRemoved) - Method in class org.apache.spark.ui.exec.ExecutorsListener
onFail(Function1<Exception, BoxedUnit>) - Method in class org.apache.spark.partial.PartialResult: Set a handler to be called if this PartialResult's job fails.
onFailure(String, QueryExecution, Exception) - Method in interface org.apache.spark.sql.util.QueryExecutionListener: A callback function that will be called when a query execution failed.
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.JavaSparkListener
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.scheduler.JobLogger: When job ends, record job completion status and close log file
onJobEnd(SparkListenerJobEnd) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a job ends
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.SparkFirehoseListener
onJobEnd(SparkListenerJobEnd) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.JavaSparkListener
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.scheduler.JobLogger: When job starts, record job property and stage graph
onJobStart(SparkListenerJobStart) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a job starts
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.SparkFirehoseListener
onJobStart(SparkListenerJobStart) - Method in class org.apache.spark.ui.jobs.JobProgressListener
OnlineLDAOptimizer - Class in org.apache.spark.mllib.clustering: :: DeveloperApi ::
OnlineLDAOptimizer() - Constructor for class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
onOutputOperationCompleted(StreamingListenerOutputOperationCompleted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a job of a batch has completed.
onOutputOperationStarted(StreamingListenerOutputOperationStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when processing of a job of a batch has started.
onReceiverError(StreamingListenerReceiverError) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has reported an error
onReceiverStarted(StreamingListenerReceiverStarted) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has been started
onReceiverStopped(StreamingListenerReceiverStopped) - Method in interface org.apache.spark.streaming.scheduler.StreamingListener: Called when a receiver has been stopped
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.JavaSparkListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.JobLogger: When stage is completed, record stage completion status
onStageCompleted(SparkListenerStageCompleted) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a stage completes successfully or fails, with information on the completed stage.
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.scheduler.StatsReportListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.SparkFirehoseListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.SpillListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onStageCompleted(SparkListenerStageCompleted) - Method in class org.apache.spark.ui.storage.StorageListener
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.JavaSparkListener
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.scheduler.JobLogger: When stage is submitted, record stage submit info
onStageSubmitted(SparkListenerStageSubmitted) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a stage is submitted
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.SparkFirehoseListener
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.jobs.JobProgressListener: For FIFO, all stages belong to the "default" pool, but the "default" pool is meaningless here
onStageSubmitted(SparkListenerStageSubmitted) - Method in class org.apache.spark.ui.storage.StorageListener
onStart() - Method in class org.apache.spark.streaming.receiver.Receiver: This method is called by the system when the receiver is started.
onStop() - Method in class org.apache.spark.streaming.receiver.Receiver: This method is called by the system when the receiver is stopped.
onSuccess(String, QueryExecution, long) - Method in interface org.apache.spark.sql.util.QueryExecutionListener: A callback function that will be called when a query executed successfully.
onTaskCompletion(TaskContext) - Method in interface org.apache.spark.util.TaskCompletionListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.JavaSparkListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.JobLogger: When task ends, record task completion status and metrics
onTaskEnd(SparkListenerTaskEnd) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a task ends
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.scheduler.StatsReportListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.SparkFirehoseListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.SpillListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.storage.StorageStatusListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.exec.ExecutorsListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onTaskEnd(SparkListenerTaskEnd) - Method in class org.apache.spark.ui.storage.StorageListener: Assumes the storage status list is fully up-to-date.
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.JavaSparkListener
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a task begins remotely fetching its result (will not be called for tasks that do not need to fetch the result remotely).
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.SparkFirehoseListener
onTaskGettingResult(SparkListenerTaskGettingResult) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.JavaSparkListener
onTaskStart(SparkListenerTaskStart) - Method in interface org.apache.spark.scheduler.SparkListener: Called when a task starts
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.SparkFirehoseListener
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.exec.ExecutorsListener
onTaskStart(SparkListenerTaskStart) - Method in class org.apache.spark.ui.jobs.JobProgressListener
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.JavaSparkListener
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in interface org.apache.spark.scheduler.SparkListener: Called when an RDD is manually unpersisted by the application
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.SparkFirehoseListener
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.storage.StorageStatusListener
onUnpersistRDD(SparkListenerUnpersistRDD) - Method in class org.apache.spark.ui.storage.StorageListener
open() - Method in class org.apache.spark.input.PortableDataStream: Create a new DataInputStream from the split and context.
ops() - Method in class org.apache.spark.graphx.Graph: The associated GraphOps object.
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.GradientDescent: :: DeveloperApi :: Runs gradient descent on the given training data.
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in class org.apache.spark.mllib.optimization.LBFGS
optimize(RDD<Tuple2<Object, Vector>>, Vector) - Method in interface org.apache.spark.mllib.optimization.Optimizer: Solve the provided convex optimization problem.
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
optimizer() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
optimizer() - Method in class org.apache.spark.mllib.classification.SVMWithSGD
Optimizer - Interface in org.apache.spark.mllib.optimization: :: DeveloperApi :: Trait for optimization problem solvers.
optimizer() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: The optimizer to solve the problem.
optimizer() - Method in class org.apache.spark.mllib.regression.LassoWithSGD
optimizer() - Method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
optimizer() - Method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
optimizer() - Method in class org.apache.spark.sql.SQLContext
option(String, String) - Method in class org.apache.spark.sql.DataFrameReader: Adds an input option for the underlying data source.
option(String, String) - Method in class org.apache.spark.sql.DataFrameWriter: Adds an output option for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.DataFrameReader: (Scala-specific) Adds input options for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.DataFrameReader: Adds input options for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.DataFrameWriter: (Scala-specific) Adds output options for the underlying data source.
options(Map<String, String>) - Method in class org.apache.spark.sql.DataFrameWriter: Adds output options for the underlying data source.
or(Column) - Method in class org.apache.spark.sql.Column: Boolean OR.
Or - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff at least one of left or right evaluates to true.
Or(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.Or
OracleDialect - Class in org.apache.spark.sql.jdbc
OracleDialect() - Constructor for class org.apache.spark.sql.jdbc.OracleDialect
orc(String) - Method in class org.apache.spark.sql.DataFrameReader: Loads an ORC file and returns the result as a DataFrame.
orc(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame in ORC format at the specified path.
orderBy(String, String...) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame sorted by the given expressions.
orderBy(Column...) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame sorted by the given expressions.
orderBy(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame sorted by the given expressions.
orderBy(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame sorted by the given expressions.
orderBy(String, String...) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the ordering defined.
orderBy(Column...) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the ordering defined.
orderBy(String, Seq<String>) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the ordering defined.
orderBy(Seq<Column>) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the ordering defined.
orderBy(String, String...) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the ordering columns in a WindowSpec.
orderBy(Column...) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the ordering columns in a WindowSpec.
orderBy(String, Seq<String>) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the ordering columns in a WindowSpec.
orderBy(Seq<Column>) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the ordering columns in a WindowSpec.
OrderedRDDFunctions<K,V,P extends scala.Product2<K,V>> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion.
OrderedRDDFunctions(RDD, Ordering<K>, ClassTag<K>, ClassTag<V>, ClassTag) - Constructor for class org.apache.spark.rdd.OrderedRDDFunctions
ordering() - Method in class org.apache.spark.sql.types.BinaryType
ordering() - Method in class org.apache.spark.sql.types.BooleanType
ordering() - Method in class org.apache.spark.sql.types.ByteType
ordering() - Method in class org.apache.spark.sql.types.DateType
ordering() - Method in class org.apache.spark.sql.types.DecimalType
ordering() - Method in class org.apache.spark.sql.types.DoubleType
ordering() - Method in class org.apache.spark.sql.types.FloatType
ordering() - Method in class org.apache.spark.sql.types.IntegerType
ordering() - Method in class org.apache.spark.sql.types.LongType
ordering() - Method in class org.apache.spark.sql.types.ShortType
ordering() - Method in class org.apache.spark.sql.types.StringType
ordering() - Method in class org.apache.spark.sql.types.TimestampType
ordering() - Static method in class org.apache.spark.streaming.Time
org.apache.spark - package org.apache.spark: Core Spark classes in Scala.
org.apache.spark.annotation - package org.apache.spark.annotation: Spark annotations to mark an API as experimental or intended only for advanced use by developers.
org.apache.spark.api.java - package org.apache.spark.api.java: Spark Java programming APIs.
org.apache.spark.api.java.function - package org.apache.spark.api.java.function: Set of interfaces to represent functions in Spark's Java API.
org.apache.spark.api.r - package org.apache.spark.api.r
org.apache.spark.broadcast - package org.apache.spark.broadcast: Spark's broadcast variables, used to broadcast immutable datasets to all nodes.
org.apache.spark.graphx - package org.apache.spark.graphx: ALPHA COMPONENT. GraphX is a graph processing framework built on top of Spark.
org.apache.spark.graphx.impl - package org.apache.spark.graphx.impl
org.apache.spark.graphx.lib - package org.apache.spark.graphx.lib: Various analytics functions for graphs.
org.apache.spark.graphx.util - package org.apache.spark.graphx.util: Collections of utilities used by GraphX.
org.apache.spark.input - package org.apache.spark.input
org.apache.spark.io - package org.apache.spark.io: IO codecs used for compression.
org.apache.spark.launcher - package org.apache.spark.launcher: Library for launching Spark applications.
org.apache.spark.ml - package org.apache.spark.ml: Spark ML is a BETA component that adds a new set of machine learning APIs to let users quickly assemble and configure practical machine learning pipelines.
org.apache.spark.ml.attribute - package org.apache.spark.ml.attribute: ML attributes
org.apache.spark.ml.classification - package org.apache.spark.ml.classification
org.apache.spark.ml.clustering - package org.apache.spark.ml.clustering
org.apache.spark.ml.evaluation - package org.apache.spark.ml.evaluation
org.apache.spark.ml.feature - package org.apache.spark.ml.feature: Feature transformers. The `ml.feature` package provides common feature transformers that help convert raw data or features into more suitable forms for model fitting.
org.apache.spark.ml.param - package org.apache.spark.ml.param
org.apache.spark.ml.recommendation - package org.apache.spark.ml.recommendation
org.apache.spark.ml.regression - package org.apache.spark.ml.regression
org.apache.spark.ml.source.libsvm - package org.apache.spark.ml.source.libsvm
org.apache.spark.ml.tree - package org.apache.spark.ml.tree
org.apache.spark.ml.tuning - package org.apache.spark.ml.tuning
org.apache.spark.ml.util - package org.apache.spark.ml.util
org.apache.spark.mllib.classification - package org.apache.spark.mllib.classification
org.apache.spark.mllib.clustering - package org.apache.spark.mllib.clustering
org.apache.spark.mllib.evaluation - package org.apache.spark.mllib.evaluation
org.apache.spark.mllib.feature - package org.apache.spark.mllib.feature
org.apache.spark.mllib.fpm - package org.apache.spark.mllib.fpm
org.apache.spark.mllib.linalg - package org.apache.spark.mllib.linalg
org.apache.spark.mllib.linalg.distributed - package org.apache.spark.mllib.linalg.distributed
org.apache.spark.mllib.optimization - package org.apache.spark.mllib.optimization
org.apache.spark.mllib.pmml - package org.apache.spark.mllib.pmml
org.apache.spark.mllib.random - package org.apache.spark.mllib.random
org.apache.spark.mllib.rdd - package org.apache.spark.mllib.rdd
org.apache.spark.mllib.recommendation - package org.apache.spark.mllib.recommendation
org.apache.spark.mllib.regression - package org.apache.spark.mllib.regression
org.apache.spark.mllib.stat - package org.apache.spark.mllib.stat
org.apache.spark.mllib.stat.distribution - package org.apache.spark.mllib.stat.distribution
org.apache.spark.mllib.stat.test - package org.apache.spark.mllib.stat.test
org.apache.spark.mllib.tree - package org.apache.spark.mllib.tree
org.apache.spark.mllib.tree.configuration - package org.apache.spark.mllib.tree.configuration
org.apache.spark.mllib.tree.impurity - package org.apache.spark.mllib.tree.impurity
org.apache.spark.mllib.tree.loss - package org.apache.spark.mllib.tree.loss
org.apache.spark.mllib.tree.model - package org.apache.spark.mllib.tree.model
org.apache.spark.mllib.util - package org.apache.spark.mllib.util
org.apache.spark.partial - package org.apache.spark.partial
org.apache.spark.rdd - package org.apache.spark.rdd: Provides implementations of various RDDs.
org.apache.spark.scheduler - package org.apache.spark.scheduler: Spark's DAG scheduler.
org.apache.spark.scheduler.cluster - package org.apache.spark.scheduler.cluster
org.apache.spark.scheduler.local - package org.apache.spark.scheduler.local
org.apache.spark.serializer - package org.apache.spark.serializer: Pluggable serializers for RDD and shuffle data.
org.apache.spark.sql - package org.apache.spark.sql
org.apache.spark.sql.api.java - package org.apache.spark.sql.api.java: Allows the execution of relational queries, including those expressed in SQL using Spark.
org.apache.spark.sql.expressions - package org.apache.spark.sql.expressions
org.apache.spark.sql.hive - package org.apache.spark.sql.hive
org.apache.spark.sql.hive.execution - package org.apache.spark.sql.hive.execution
org.apache.spark.sql.jdbc - package org.apache.spark.sql.jdbc
org.apache.spark.sql.sources - package org.apache.spark.sql.sources
org.apache.spark.sql.types - package org.apache.spark.sql.types
org.apache.spark.sql.util - package org.apache.spark.sql.util
org.apache.spark.status.api.v1 - package org.apache.spark.status.api.v1
org.apache.spark.storage - package org.apache.spark.storage
org.apache.spark.streaming - package org.apache.spark.streaming
org.apache.spark.streaming.api.java - package org.apache.spark.streaming.api.java: Java APIs for Spark Streaming.
org.apache.spark.streaming.dstream - package org.apache.spark.streaming.dstream: Various implementations of DStreams.
org.apache.spark.streaming.flume - package org.apache.spark.streaming.flume: Spark streaming receiver for Flume.
org.apache.spark.streaming.kafka - package org.apache.spark.streaming.kafka: Kafka receiver for Spark Streaming.
org.apache.spark.streaming.kinesis - package org.apache.spark.streaming.kinesis
org.apache.spark.streaming.mqtt - package org.apache.spark.streaming.mqtt: MQTT receiver for Spark Streaming.
org.apache.spark.streaming.receiver - package org.apache.spark.streaming.receiver
org.apache.spark.streaming.scheduler - package org.apache.spark.streaming.scheduler
org.apache.spark.streaming.twitter - package org.apache.spark.streaming.twitter: Twitter feed receiver for Spark Streaming.
org.apache.spark.streaming.util - package org.apache.spark.streaming.util
org.apache.spark.streaming.zeromq - package org.apache.spark.streaming.zeromq: ZeroMQ receiver for Spark Streaming.
org.apache.spark.ui.env - package org.apache.spark.ui.env
org.apache.spark.ui.exec - package org.apache.spark.ui.exec
org.apache.spark.ui.jobs - package org.apache.spark.ui.jobs
org.apache.spark.ui.storage - package org.apache.spark.ui.storage
org.apache.spark.util - package org.apache.spark.util: Spark utilities.
org.apache.spark.util.random - package org.apache.spark.util.random: Utilities for random number generation.
originalMax() - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
originalMin() - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
other() - Method in class org.apache.spark.scheduler.RuntimePercentage
otherInfo() - Method in class org.apache.spark.streaming.receiver.Statistics
otherVertexAttr(long) - Method in class org.apache.spark.graphx.EdgeTriplet: Given one vertex in the edge, return the attribute of the other vertex.
otherVertexId(long) - Method in class org.apache.spark.graphx.Edge: Given one vertex in the edge, return the other vertex.
otherwise(Object) - Method in class org.apache.spark.sql.Column: Evaluates a list of conditions and returns one of multiple possible result expressions.
Out() - Static method in class org.apache.spark.graphx.EdgeDirection: Edges originating from a vertex.
outDegrees() - Method in class org.apache.spark.graphx.GraphOps: The out-degree of each vertex in the graph.
outerJoinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, Option, VD2>, ClassTag, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph: Joins the vertices with entries in the table RDD and merges the results using mapFunc.
outerJoinVertices(RDD<Tuple2<Object, U>>, Function3<Object, VD, Option, VD2>, ClassTag, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
outputBytes() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
outputBytes() - Method in class org.apache.spark.status.api.v1.StageData
OutputCommitCoordinationMessage - Interface in org.apache.spark.scheduler
outputCommitCoordinator() - Method in class org.apache.spark.SparkEnv
outputDataType() - Method in class org.apache.spark.ml.feature.DCT
outputDataType() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
outputDataType() - Method in class org.apache.spark.ml.feature.NGram
outputDataType() - Method in class org.apache.spark.ml.feature.Normalizer
outputDataType() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
outputDataType() - Method in class org.apache.spark.ml.feature.RegexTokenizer
outputDataType() - Method in class org.apache.spark.ml.feature.Tokenizer
outputDataType() - Method in class org.apache.spark.ml.UnaryTransformer: Returns the data type of the output column.
OutputMetricDistributions - Class in org.apache.spark.status.api.v1
OutputMetrics - Class in org.apache.spark.status.api.v1
outputMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
outputMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
OutputOperationInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Class having information on output operations.
OutputOperationInfo(Time, int, String, String, Option<Object>, Option<Object>, Option<String>) - Constructor for class org.apache.spark.streaming.scheduler.OutputOperationInfo
outputOperationInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
outputOperationInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
outputOperationInfos() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
outputRecords() - Method in class org.apache.spark.status.api.v1.StageData
OutputWriter - Class in org.apache.spark.sql.sources: ::Experimental:: OutputWriter is used together with HadoopFsRelation for persisting rows to the underlying file system.
OutputWriter() - Constructor for class org.apache.spark.sql.sources.OutputWriter
OutputWriterFactory - Class in org.apache.spark.sql.sources: ::Experimental:: A factory that produces OutputWriters.
OutputWriterFactory() - Constructor for class org.apache.spark.sql.sources.OutputWriterFactory
over(WindowSpec) - Method in class org.apache.spark.sql.Column: Define a windowing column.
overwrite() - Method in class org.apache.spark.ml.util.MLWriter: Overwrites if the output path already exists.

P

p() - Method in class org.apache.spark.ml.feature.Normalizer: Normalization in L^p space.
pageRank(double, double) - Method in class org.apache.spark.graphx.GraphOps: Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
PageRank - Class in org.apache.spark.graphx.lib: PageRank algorithm implementation.
PageRank() - Constructor for class org.apache.spark.graphx.lib.PageRank
PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream: Extra functions available on DStream of (key, value) pairs through an implicit conversion.
PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function: A function that returns zero or more key-value pair records from each input record.
PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function: A function that returns key-value pairs (Tuple2<K, V>), and can be used to construct PairRDDs.
PairRDDFunctions<K,V> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
PairwiseRRDD<T> - Class in org.apache.spark.api.r: Form an RDD[(Int, Array[Byte])] from key-value pairs returned from R.
PairwiseRRDD(RDD<T>, int, byte[], String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.PairwiseRRDD
parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Distribute a local Scala collection to form an RDD.
Param<T> - Class in org.apache.spark.ml.param: :: DeveloperApi :: A param with self-contained documentation and optionally default value.
Param(String, String, String, Function1<T, Object>) - Constructor for class org.apache.spark.ml.param.Param
Param(Identifiable, String, String, Function1<T, Object>) - Constructor for class org.apache.spark.ml.param.Param
Param(String, String, String) - Constructor for class org.apache.spark.ml.param.Param
Param(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.Param
param() - Method in class org.apache.spark.ml.param.ParamPair
ParamGridBuilder - Class in org.apache.spark.ml.tuning: :: Experimental :: Builder for a param grid used in grid search-based model selection.
ParamGridBuilder() - Constructor for class org.apache.spark.ml.tuning.ParamGridBuilder
ParamMap - Class in org.apache.spark.ml.param: :: Experimental :: A param to value map.
ParamMap() - Constructor for class org.apache.spark.ml.param.ParamMap: Creates an empty param map.
paramMap() - Method in interface org.apache.spark.ml.param.Params: Internal param map for user-supplied values.
ParamPair<T> - Class in org.apache.spark.ml.param: :: Experimental :: A param and its value.
ParamPair(Param<T>, T) - Constructor for class org.apache.spark.ml.param.ParamPair
Params - Interface in org.apache.spark.ml.param
params() - Method in interface org.apache.spark.ml.param.Params
ParamValidators - Class in org.apache.spark.ml.param: :: DeveloperApi :: Factory methods for common validation functions for Param.isValid.
ParamValidators() - Constructor for class org.apache.spark.ml.param.ParamValidators
parent() - Method in class org.apache.spark.ml.Model: The parent estimator that produced this model.
parent() - Method in class org.apache.spark.ml.param.Param
parent(int, ClassTag) - Method in class org.apache.spark.rdd.RDD: Returns the jth parent RDD: e.g.
parentIds() - Method in class org.apache.spark.scheduler.StageInfo
parentIds() - Method in class org.apache.spark.storage.RDDInfo
parentIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Get the parent index of the given node, or 0 if it is the root.
parquet(String...) - Method in class org.apache.spark.sql.DataFrameReader: Loads a Parquet file, returning the result as a DataFrame.
parquet(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads a Parquet file, returning the result as a DataFrame.
parquet(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame in Parquet format at the specified path.
parquetFile(String...) - Method in class org.apache.spark.sql.SQLContext: Deprecated. As of 1.4.0, replaced by read().parquet(). This will be removed in Spark 2.0.
parquetFile(Seq<String>) - Method in class org.apache.spark.sql.SQLContext
parse(String) - Static method in class org.apache.spark.mllib.linalg.Vectors: Parses a string resulting from Vector.toString into a Vector.
parse(String) - Static method in class org.apache.spark.mllib.regression.LabeledPoint
parseDataType(String) - Method in class org.apache.spark.sql.SQLContext
parseIgnoreCase(Class<E>, String) - Static method in class org.apache.spark.util.EnumUtil
parseSql(String) - Method in class org.apache.spark.sql.hive.HiveContext
parseSql(String) - Method in class org.apache.spark.sql.SQLContext
PartialResult<R> - Class in org.apache.spark.partial
PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
Partition - Interface in org.apache.spark: An identifier for a partition in an RDD.
partition() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
partition() - Method in class org.apache.spark.streaming.kafka.OffsetRange
partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a copy of the RDD partitioned using the specified partitioner.
partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.Graph: Repartitions the edges in the graph according to partitionStrategy.
partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.Graph: Repartitions the edges in the graph according to partitionStrategy.
partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.impl.GraphImpl
partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.impl.GraphImpl
partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return a copy of the RDD partitioned using the specified partitioner.
partitionBy(String...) - Method in class org.apache.spark.sql.DataFrameWriter: Partitions the output by the given columns on the file system.
partitionBy(Seq<String>) - Method in class org.apache.spark.sql.DataFrameWriter: Partitions the output by the given columns on the file system.
partitionBy(String, String...) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the partitioning defined.
partitionBy(Column...) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the partitioning defined.
partitionBy(String, Seq<String>) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the partitioning defined.
partitionBy(Seq<Column>) - Static method in class org.apache.spark.sql.expressions.Window: Creates a WindowSpec with the partitioning defined.
partitionBy(String, String...) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the partitioning columns in a WindowSpec.
partitionBy(Column...) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the partitioning columns in a WindowSpec.
partitionBy(String, Seq<String>) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the partitioning columns in a WindowSpec.
partitionBy(Seq<Column>) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the partitioning columns in a WindowSpec.
PartitionCoalescer - Class in org.apache.spark.rdd: Coalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of this RDD computes one or more of the parent ones.
PartitionCoalescer(int, RDD<?>, double) - Constructor for class org.apache.spark.rdd.PartitionCoalescer
PartitionCoalescer.LocationIterator - Class in org.apache.spark.rdd
PartitionCoalescer.LocationIterator(RDD<?>) - Constructor for class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
partitionColumns() - Method in class org.apache.spark.sql.sources.HadoopFsRelation: Partition columns.
partitioner() - Method in interface org.apache.spark.api.java.JavaRDDLike: The partitioner of this RDD.
partitioner() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl: If partitionsRDD already has a partitioner, use it.
partitioner() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
Partitioner - Class in org.apache.spark: An object that defines how the elements in a key-value pair RDD are partitioned by key.
Partitioner() - Constructor for class org.apache.spark.Partitioner
partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
partitioner() - Method in class org.apache.spark.rdd.RDD: Optionally overridden by subclasses to specify how they are partitioned.
partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
partitioner() - Method in class org.apache.spark.ShuffleDependency
partitioner(Partitioner) - Method in class org.apache.spark.streaming.StateSpec: Set the partitioner by which the state RDDs generated by mapWithState will be partitioned.
PartitionGroup - Class in org.apache.spark.rdd
PartitionGroup(Option<String>) - Constructor for class org.apache.spark.rdd.PartitionGroup
partitionID() - Method in class org.apache.spark.TaskCommitDenied
partitionId() - Method in class org.apache.spark.TaskContext: The ID of the RDD partition that is computed by this task.
PartitionPruningRDD<T> - Class in org.apache.spark.rdd: :: DeveloperApi :: An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions.
PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
partitions() - Method in interface org.apache.spark.api.java.JavaRDDLike: Set of partitions in this RDD.
partitions() - Method in class org.apache.spark.rdd.RDD: Get the array of partitions of this RDD, taking into account whether the RDD is checkpointed or not.
partitions() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
partitionsRDD() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
partitionsRDD() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
PartitionStrategy - Interface in org.apache.spark.graphx: Represents the way edges are assigned to edge partitions based on their source and destination vertex IDs.
PartitionStrategy.CanonicalRandomVertexCut$ - Class in org.apache.spark.graphx: Assigns edges to partitions by hashing the source and destination vertex IDs in a canonical direction, resulting in a random vertex cut that colocates all edges between two vertices, regardless of direction.
PartitionStrategy.CanonicalRandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
PartitionStrategy.EdgePartition1D$ - Class in org.apache.spark.graphx: Assigns edges to partitions using only the source vertex ID, colocating edges with the same source.
PartitionStrategy.EdgePartition1D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
PartitionStrategy.EdgePartition2D$ - Class in org.apache.spark.graphx: Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix, guaranteeing a 2 * sqrt(numParts) bound on vertex replication.
PartitionStrategy.EdgePartition2D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
PartitionStrategy.RandomVertexCut$ - Class in org.apache.spark.graphx: Assigns edges to partitions by hashing the source and destination vertex IDs, resulting in a random vertex cut that colocates all same-direction edges between two vertices.
PartitionStrategy.RandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
path() - Method in class org.apache.spark.scheduler.InputFormatInfo
path() - Method in class org.apache.spark.scheduler.SplitInfo
path() - Method in class org.apache.spark.sql.sources.HadoopFsRelation.FakeFileStatus
paths() - Method in class org.apache.spark.sql.sources.HadoopFsRelation: Paths of this relation.
pattern() - Method in class org.apache.spark.ml.feature.RegexTokenizer: Regex pattern used to match delimiters if gaps is true or tokens if gaps is false.
pc() - Method in class org.apache.spark.ml.feature.PCAModel
pc() - Method in class org.apache.spark.mllib.feature.PCAModel
PCA - Class in org.apache.spark.ml.feature: :: Experimental :: PCA trains a model to project vectors to a low-dimensional space using PCA.
PCA(String) - Constructor for class org.apache.spark.ml.feature.PCA
PCA() - Constructor for class org.apache.spark.ml.feature.PCA
PCA - Class in org.apache.spark.mllib.feature: A feature transformer that projects vectors to a low-dimensional space using PCA.
PCA(int) - Constructor for class org.apache.spark.mllib.feature.PCA
PCAModel - Class in org.apache.spark.ml.feature
PCAModel - Class in org.apache.spark.mllib.feature: Model fitted by PCA that can project vectors to a low-dimensional space using PCA.
pdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian: Returns the density of this multivariate Gaussian at the given point, x.
pendingStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
percent_rank() - Static method in class org.apache.spark.sql.functions: Window function: returns the relative rank (i.e.
percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
percentRank() - Static method in class org.apache.spark.sql.functions: Deprecated. As of 1.6.0, replaced by percent_rank. This will be removed in Spark 2.0.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist(StorageLevel) - Method in class org.apache.spark.graphx.Graph: Caches the vertices and edges associated with this graph at the specified storage level, ignoring any target storage levels previously set.
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl: Persists the edge partitions at the specified storage level, ignoring any existing target storage level.
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.GraphImpl
persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl: Persists the vertex partitions at the specified storage level, ignoring any existing target storage level.
persist(StorageLevel) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Persists the underlying RDD with the specified storage level.
persist(StorageLevel) - Method in class org.apache.spark.rdd.HadoopRDD
persist(StorageLevel) - Method in class org.apache.spark.rdd.NewHadoopRDD
persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD: Set this RDD's storage level to persist its values across operations after the first time it is computed.
persist() - Method in class org.apache.spark.rdd.RDD: Persist this RDD with the default storage level (`MEMORY_ONLY`).
persist() - Method in class org.apache.spark.sql.DataFrame: Persist this DataFrame with the default storage level (MEMORY_AND_DISK).
persist(StorageLevel) - Method in class org.apache.spark.sql.DataFrame: Persist this DataFrame with the given storage level.
persist() - Method in class org.apache.spark.sql.Dataset: Persist this Dataset with the default storage level (MEMORY_AND_DISK).
persist(StorageLevel) - Method in class org.apache.spark.sql.Dataset: Persist this Dataset with the given storage level.
persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist the RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Persist the RDDs of this DStream with the given storage level.
persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist the RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Persist the RDDs of this DStream with the given storage level.
persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream: Persist the RDDs of this DStream with the given storage level.
persist() - Method in class org.apache.spark.streaming.dstream.DStream: Persist the RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
persistentRdds() - Method in class org.apache.spark.SparkContext
personalizedPageRank(long, double, double) - Method in class org.apache.spark.graphx.GraphOps: Run personalized PageRank for a given vertex, such that all random walks are started relative to the source node.
pi() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
pickBin(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer: Takes a parent RDD partition and decides which of the partition groups to put it in. Takes locality into account, but also uses power of 2 choices to load balance. It strikes a balance between the two using the balanceSlack variable.
pickRandomVertex() - Method in class org.apache.spark.graphx.GraphOps: Picks a random vertex from the graph and returns its ID.
pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an RDD created by piping elements to a forked external process.
pipe(String) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean) - Method in class org.apache.spark.rdd.RDD: Return an RDD created by piping elements to a forked external process.
Pipeline - Class in org.apache.spark.ml: :: Experimental :: A simple pipeline, which acts as an estimator.
Pipeline(String) - Constructor for class org.apache.spark.ml.Pipeline
Pipeline() - Constructor for class org.apache.spark.ml.Pipeline
PipelineModel - Class in org.apache.spark.ml: :: Experimental :: Represents a fitted pipeline.
PipelineStage - Class in org.apache.spark.ml: :: DeveloperApi :: A stage in a pipeline, either an Estimator or a Transformer.
PipelineStage() - Constructor for class org.apache.spark.ml.PipelineStage
pivot(String) - Method in class org.apache.spark.sql.GroupedData: Pivots a column of the current DataFrame and performs the specified aggregation.
pivot(String, Seq<Object>) - Method in class org.apache.spark.sql.GroupedData: Pivots a column of the current DataFrame and performs the specified aggregation.
pivot(String, List<Object>) - Method in class org.apache.spark.sql.GroupedData: Pivots a column of the current DataFrame and performs the specified aggregation.
planner() - Method in class org.apache.spark.sql.hive.HiveContext
planner() - Method in class org.apache.spark.sql.SQLContext
plus(Object) - Method in class org.apache.spark.sql.Column: Sum of this expression and another expression.
plus(Duration) - Method in class org.apache.spark.streaming.Duration
plus(Duration) - Method in class org.apache.spark.streaming.Time
plusDot(Vector, Vector) - Method in class org.apache.spark.util.Vector: Returns (this + plus) dot other, without creating any intermediate storage.
PMMLExportable - Interface in org.apache.spark.mllib.pmml: :: DeveloperApi :: Export model to the PMML format. Predictive Model Markup Language (PMML) is an XML-based file format developed by the Data Mining Group (www.dmg.org).
pmod(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns the positive value of dividend mod divisor.
point() - Method in class org.apache.spark.mllib.feature.VocabWord
POINTS() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
PoissonGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the Poisson distribution with the given mean.
PoissonGenerator(double) - Constructor for class org.apache.spark.mllib.random.PoissonGenerator
poissonJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.poissonRDD(org.apache.spark.SparkContext, double, long, int, long).
poissonJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, long) with the default seed.
poissonJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, long) with the default number of partitions and the default seed.
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.poissonVectorRDD(org.apache.spark.SparkContext, double, long, int, int, long).
poissonJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, int, long) with the default seed.
poissonJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.poissonJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, double, long, int, int, long) with the default number of partitions and the default seed.
poissonRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.
PoissonSampler<T> - Class in org.apache.spark.util.random: :: DeveloperApi :: A sampler for sampling with replacement, based on values drawn from Poisson distribution.
PoissonSampler(double, boolean, ClassTag<T>) - Constructor for class org.apache.spark.util.random.PoissonSampler
PoissonSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.PoissonSampler
poissonVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean.
PolynomialExpansion - Class in org.apache.spark.ml.feature: :: Experimental :: Perform feature expansion in a polynomial space.
PolynomialExpansion(String) - Constructor for class org.apache.spark.ml.feature.PolynomialExpansion
PolynomialExpansion() - Constructor for class org.apache.spark.ml.feature.PolynomialExpansion
poolToActiveStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
port() - Method in class org.apache.spark.storage.BlockManagerId
port() - Method in class org.apache.spark.streaming.kafka.Broker: Broker's port
PortableDataStream - Class in org.apache.spark.input: A class that allows DataStreams to be serialized and moved around by not creating them until they need to be read
PortableDataStream(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.PortableDataStream
PostgresDialect - Class in org.apache.spark.sql.jdbc
PostgresDialect() - Constructor for class org.apache.spark.sql.jdbc.PostgresDialect
pow(Column, Column) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(Column, String) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(String, Column) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(String, String) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(Column, double) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(String, double) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(double, Column) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
pow(double, String) - Static method in class org.apache.spark.sql.functions: Returns the value of the first argument raised to the power of the second argument.
PowerIterationClustering - Class in org.apache.spark.mllib.clustering
PowerIterationClustering() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering
PowerIterationClustering.Assignment - Class in org.apache.spark.mllib.clustering: Cluster assignment.
PowerIterationClustering.Assignment(long, int) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
PowerIterationClustering.Assignment$ - Class in org.apache.spark.mllib.clustering
PowerIterationClustering.Assignment$() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$
PowerIterationClusteringModel - Class in org.apache.spark.mllib.clustering: Model produced by PowerIterationClustering.
PowerIterationClusteringModel(int, RDD<PowerIterationClustering.Assignment>) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
pr() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Returns the precision-recall curve, which is a DataFrame containing two fields (recall, precision), with (0.0, 1.0) prepended to it.
pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the precision-recall curve, which is an RDD of (recall, precision), NOT (precision, recall), with (0.0, 1.0) prepended to it.
precision(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns precision for a given label (category)
precision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns precision
precision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns document-based precision averaged by the number of documents
precision(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns precision for a given label (category)
precision() - Method in class org.apache.spark.sql.types.Decimal
precision() - Method in class org.apache.spark.sql.types.DecimalType
precision() - Method in class org.apache.spark.sql.types.PrecisionInfo
precisionAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics: Compute the average precision of all the queries, truncated at ranking position k.
precisionByThreshold() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Returns the (threshold, precision) curve as a DataFrame with two fields.
precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, precision) curve.
precisionInfo() - Method in class org.apache.spark.sql.types.DecimalType
PrecisionInfo - Class in org.apache.spark.sql.types: Precision parameters for a Decimal
PrecisionInfo(int, int) - Constructor for class org.apache.spark.sql.types.PrecisionInfo
predict(FeaturesType) - Method in class org.apache.spark.ml.classification.ClassificationModel: Predict label for the given features.
predict(Vector) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
predict(Vector) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
predict(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: Predict label for the given feature vector.
predict(Vector) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel: Predict label for the given features.
predict(FeaturesType) - Method in class org.apache.spark.ml.PredictionModel: Predict label for the given features.
predict(Vector) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
predict(Vector) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
predict(Vector) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
predict(Vector) - Method in class org.apache.spark.ml.regression.LinearRegressionModel
predict(Vector) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel: Predict values for examples stored in a JavaRDD.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
predict(Vector) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Predicts the index of the cluster that the input point belongs to.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Predicts the indices of the clusters that the input points belong to.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel: Java-friendly version of predict().
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Maps given points to their cluster indices.
predict(Vector) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Maps given point to its cluster index.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Java-friendly version of predict()
predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Returns the cluster index that a given point belongs to.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Maps given points to their cluster indices.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel: Maps given points to their cluster indices.
predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Predict the rating of one user for one product.
predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Predict the rating of many users for many products.
predict(JavaPairRDD<Integer, Integer>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Java-friendly version of MatrixFactorizationModel.predict.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Predict values for the given data set using the model trained.
predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Predict values for a single data point using the model trained.
predict(RDD<Object>) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel: Predict labels for provided features.
predict(JavaDoubleRDD) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel: Predict labels for provided features.
predict(double) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel: Predict a single label.
predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for the given data set using the model trained.
predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for a single data point using the model trained.
predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel: Predict values for examples stored in a JavaRDD.
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Predict values for a single data point using the model trained.
predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Predict values for the given data set using the model trained.
predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Predict values for the given data set using the model trained.
predict() - Method in class org.apache.spark.mllib.tree.model.Node
predict(Vector) - Method in class org.apache.spark.mllib.tree.model.Node: Predict value if node is not leaf.
Predict - Class in org.apache.spark.mllib.tree.model: Predicted value for a node. param: predict - predicted value. param: prob - probability of the label (classification only).
Predict(double, double) - Constructor for class org.apache.spark.mllib.tree.model.Predict
predict() - Method in class org.apache.spark.mllib.tree.model.Predict
prediction() - Method in class org.apache.spark.ml.tree.InternalNode
prediction() - Method in class org.apache.spark.ml.tree.LeafNode
prediction() - Method in class org.apache.spark.ml.tree.Node: Prediction a leaf node makes, or which an internal node would make if it were a leaf node
predictionCol() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
PredictionModel<FeaturesType,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml: :: DeveloperApi :: Abstraction for a model for prediction tasks (regression and classification).
PredictionModel() - Constructor for class org.apache.spark.ml.PredictionModel
predictions() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
predictions() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: DataFrame output by the model's `transform` method.
predictions() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel: Predictions associated with the boundaries at the same index, monotone because of isotonic regression.
predictions() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
predictions() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Use the clustering model to make predictions on batches of data from a DStream.
predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Java-friendly version of predictOn.
predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Use the model to make predictions on batches of data from a DStream
predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Java-friendly version of predictOn.
predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Use the model to make predictions on the values of a DStream and carry over its keys.
predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Java-friendly version of predictOnValues.
predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Use the model to make predictions on the values of a DStream and carry over its keys.
predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Java-friendly version of predictOnValues.
Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml: :: DeveloperApi :: Abstraction for prediction problems (regression and classification).
Predictor() - Constructor for class org.apache.spark.ml.Predictor
predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.classification.SVMModel
predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Predict the result given a data point and the weights learned.
predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.regression.LassoModel
predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
predictPoint(Vector, Vector, double) - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
predictProbabilities(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel: Predict values for the given data set using the model trained.
predictProbabilities(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel: Predict posterior class probabilities for a single data point using the model trained.
predictProbability(FeaturesType) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel: Predict the probability of each class given the features.
predictQuantiles(Vector) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
predictRaw(FeaturesType) - Method in class org.apache.spark.ml.classification.ClassificationModel: Raw prediction for each possible label.
predictRaw(Vector) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
predictRaw(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
predictRaw(Vector) - Method in class org.apache.spark.ml.classification.NaiveBayesModel
predictRaw(Vector) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
predictSoft(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Given the input vectors, return the membership value of each vector to all mixture components.
predictSoft(Vector) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel: Given the input vector, return the membership values to all mixture components.
preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver: Override this to specify a preferred location (hostname).
preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD: Get the preferred locations of a partition, taking into account whether the RDD is checkpointed.
PrefixSpan - Class in org.apache.spark.mllib.fpm: :: Experimental ::
PrefixSpan() - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan: Constructs a default instance with default parameters {minSupport: 0.1, maxPatternLength: 10, maxLocalProjDBSize: 32000000L}.
PrefixSpan.FreqSequence<Item> - Class in org.apache.spark.mllib.fpm: Represents a frequent sequence.
PrefixSpan.FreqSequence(Object[], long) - Constructor for class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
PrefixSpanModel<Item> - Class in org.apache.spark.mllib.fpm: Model fitted by PrefixSpan. param: freqSequences - frequent sequences.
PrefixSpanModel(RDD<PrefixSpan.FreqSequence<Item>>) - Constructor for class org.apache.spark.mllib.fpm.PrefixSpanModel
prefLoc() - Method in class org.apache.spark.rdd.PartitionGroup
pregel(A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<A>) - Method in class org.apache.spark.graphx.GraphOps: Execute a Pregel-like iterative vertex-parallel abstraction.
Pregel - Class in org.apache.spark.graphx: Implements a Pregel-like bulk-synchronous message-passing API.
Pregel() - Constructor for class org.apache.spark.graphx.Pregel
prepareForExecution() - Method in class org.apache.spark.sql.SQLContext
prepareJobForWrite(Job) - Method in class org.apache.spark.sql.sources.HadoopFsRelation: Prepares a write job and returns an OutputWriterFactory.
prettyJson() - Method in class org.apache.spark.sql.types.DataType: The pretty (i.e. indented) JSON representation of this data type.
prettyPrint() - Method in class org.apache.spark.streaming.Duration
prev() - Method in class org.apache.spark.rdd.ShuffledRDD
prevHandler() - Method in class org.apache.spark.util.SignalLoggerHandler
primitiveTypes() - Static method in class org.apache.spark.sql.hive.HiveContext
print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Print the first ten elements of each RDD generated in this DStream.
print(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Print the first num elements of each RDD generated in this DStream.
print() - Method in class org.apache.spark.streaming.dstream.DStream: Print the first ten elements of each RDD generated in this DStream.
print(int) - Method in class org.apache.spark.streaming.dstream.DStream: Print the first num elements of each RDD generated in this DStream.
printSchema() - Method in class org.apache.spark.sql.DataFrame: Prints the schema to the console in a nice tree format.
printSchema() - Method in class org.apache.spark.sql.Dataset: Prints the schema of the underlying Dataset to the console in a nice tree format.
printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
printTreeString() - Method in class org.apache.spark.sql.types.StructType
Private - Annotation Type in org.apache.spark.annotation: A class that is considered private to the internals of Spark; there is a high likelihood it will be changed in future versions of Spark.
prob() - Method in class org.apache.spark.mllib.tree.model.Predict
ProbabilisticClassificationModel<FeaturesType,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification: :: DeveloperApi ::
ProbabilisticClassificationModel() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassificationModel
ProbabilisticClassifier<FeaturesType,E extends ProbabilisticClassifier<FeaturesType,E,M>,M extends ProbabilisticClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification: :: DeveloperApi ::
ProbabilisticClassifier() - Constructor for class org.apache.spark.ml.classification.ProbabilisticClassifier
probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
probability2prediction(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
probability2prediction(Vector) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel: Given a vector of class conditional probabilities, select the predicted label.
probabilityCol() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
probabilityCol() - Method in interface org.apache.spark.ml.classification.LogisticRegressionSummary: Field in "predictions" which gives the calibrated probability of each instance as a vector.
PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for all jobs of this batch to finish processing, measured from the time they started processing.
processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
product() - Method in class org.apache.spark.mllib.recommendation.Rating
productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
progressListener() - Method in class org.apache.spark.streaming.StreamingContext
properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
PrunedFilteredScan - Interface in org.apache.spark.sql.sources: :: DeveloperApi :: A BaseRelation that can eliminate unneeded columns and filter using selected predicates before producing an RDD containing all matching tuples as Row objects.
PrunedScan - Interface in org.apache.spark.sql.sources: :: DeveloperApi :: A BaseRelation that can eliminate unneeded columns before producing an RDD containing all of its tuples as Row objects.
Pseudorandom - Interface in org.apache.spark.util.random: :: DeveloperApi :: A class with pseudorandom behavior.
put(ParamPair<?>...) - Method in class org.apache.spark.ml.param.ParamMap: Puts a list of param pairs (overwrites if the input params exist).
put(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap: Puts a (param, value) pair (overwrites if the input param exists).
put(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.param.ParamMap: Puts a list of param pairs (overwrites if the input params exist).
putBoolean(String, boolean) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Boolean.
putBooleanArray(String, boolean[]) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Boolean array.
putDouble(String, double) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Double.
putDoubleArray(String, double[]) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Double array.
putLong(String, long) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Long.
putLongArray(String, long[]) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Long array.
putMetadata(String, Metadata) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Metadata.
putMetadataArray(String, Metadata[]) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a Metadata array.
putString(String, String) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a String.
putStringArray(String, String[]) - Method in class org.apache.spark.sql.types.MetadataBuilder: Puts a String array.
pValue() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
pValue() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
pValue() - Method in interface org.apache.spark.mllib.stat.test.TestResult: The probability of obtaining a test statistic result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.
pValues() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Two-sided p-value of estimated coefficients and intercept.
pyUDT() - Method in class org.apache.spark.mllib.linalg.VectorUDT
pyUDT() - Method in class org.apache.spark.sql.types.UserDefinedType: Paired Python UDT class, if it exists.
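
The pmod(Column, Column) entry above describes a "positive" modulus. As a rough illustration of that idea in plain Java (no Spark dependency), the sketch below maps a remainder into the range [0, divisor) for a positive divisor. The class and method names are hypothetical, and this is not Spark's implementation; behavior for negative divisors is deliberately left out of scope.

```java
public class PmodSketch {
    // Hypothetical helper for illustration only, not part of Spark.
    // Assumes divisor > 0; the result then lies in [0, divisor).
    static int pmod(int dividend, int divisor) {
        int r = dividend % divisor;              // Java's % keeps the sign of the dividend
        return r < 0 ? (r + divisor) % divisor : r;
    }

    public static void main(String[] args) {
        System.out.println(pmod(7, 3));   // prints 1
        System.out.println(pmod(-7, 3));  // prints 2, whereas -7 % 3 is -1 in Java
    }
}
```

The only difference from the built-in % operator shows up when the dividend is negative, which is the case the "positive value" wording in the entry refers to.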

Q

Q() - Method in class org.apache.spark.mllib.linalg.QRDecomposition
QRDecomposition<QType,RType> - Class in org.apache.spark.mllib.linalg
QRDecomposition(QType, RType) - Constructor for class org.apache.spark.mllib.linalg.QRDecomposition
quantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
QuantileDiscretizer - Class in org.apache.spark.ml.feature: :: Experimental :: QuantileDiscretizer takes a column with continuous features and outputs a column with binned categorical features.
QuantileDiscretizer(String) - Constructor for class org.apache.spark.ml.feature.QuantileDiscretizer
QuantileDiscretizer() - Constructor for class org.apache.spark.ml.feature.QuantileDiscretizer
quantiles() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
QuantileStrategy - Class in org.apache.spark.mllib.tree.configuration: Enum for selecting the quantile calculation strategy
QuantileStrategy() - Constructor for class org.apache.spark.mllib.tree.configuration.QuantileStrategy
quarter(Column) - Static method in class org.apache.spark.sql.functions: Extracts the quarter as an integer from a given date/timestamp/string.
queryExecution() - Method in class org.apache.spark.sql.DataFrame
queryExecution() - Method in class org.apache.spark.sql.Dataset
queryExecution() - Method in class org.apache.spark.sql.GroupedDataset
QueryExecutionListener - Interface in org.apache.spark.sql.util: :: Experimental :: The interface of query execution listener that can be used to analyze execution metrics.
queueStream(Queue<JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<JavaRDD<T>>, boolean, JavaRDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from a queue of RDDs.
queueStream(Queue<RDD<T>>, boolean, RDD<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from a queue of RDDs.
quoteIdentifier(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect: Quotes the identifier.
quoteIdentifier(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
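
The two quoteIdentifier(String) entries above suggest that identifier quoting varies by JDBC dialect. A minimal plain-Java sketch of that idea follows. It assumes (the index does not say so) that a generic dialect wraps names in double quotes while a MySQL-style dialect uses backticks. Both helpers are hypothetical stand-ins rather than Spark classes, and they do not escape quote characters embedded in the name.

```java
public class QuoteIdentifierSketch {
    // Hypothetical generic quoting: wrap the name in double quotes (assumed convention).
    static String quoteDefault(String name) {
        return "\"" + name + "\"";
    }

    // Hypothetical MySQL-style quoting: wrap the name in backticks (assumed convention).
    static String quoteMySql(String name) {
        return "`" + name + "`";
    }

    public static void main(String[] args) {
        // A column name containing a space needs quoting to be usable in SQL.
        System.out.println("SELECT " + quoteDefault("order id") + " FROM t");
        System.out.println("SELECT " + quoteMySql("order id") + " FROM t");
    }
}
```

Keeping the quoting rule behind a per-dialect method means SQL generation code can stay the same across databases and only the dialect object changes.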

R

R() - Method in class org.apache.spark.mllib.linalg.QRDecomposition
r2() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Returns R^2^, the coefficient of determination.
r2() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics: Returns R^2^, the unadjusted coefficient of determination.
RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
rand(long) - Static method in class org.apache.spark.sql.functions: Generate a random column with i.i.d. samples from U[0.0, 1.0].
rand() - Static method in class org.apache.spark.sql.functions: Generate a random column with i.i.d. samples from U[0.0, 1.0].
randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a DenseMatrix consisting of i.i.d. gaussian random numbers.
randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a DenseMatrix consisting of i.i.d. gaussian random numbers.
randn(long) - Static method in class org.apache.spark.sql.functions: Generate a column with i.i.d. samples from the standard normal distribution.
randn() - Static method in class org.apache.spark.sql.functions: Generate a column with i.i.d. samples from the standard normal distribution.
RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
random(int, Random) - Static method in class org.apache.spark.util.Vector: Creates a Vector of the given length containing random numbers between 0.0 and 1.0.
RandomDataGenerator<T> - Interface in org.apache.spark.mllib.random: :: DeveloperApi :: Trait for random data generators that generate i.i.d. data.
RandomForest - Class in org.apache.spark.mllib.tree: A class that implements a Random Forest learning algorithm for classification and regression.
RandomForest(Strategy, int, String, int) - Constructor for class org.apache.spark.mllib.tree.RandomForest
RandomForestClassificationModel - Class in org.apache.spark.ml.classification: :: Experimental :: Random Forest model for classification.
RandomForestClassifier - Class in org.apache.spark.ml.classification: :: Experimental :: Random Forest learning algorithm for classification.
RandomForestClassifier(String) - Constructor for class org.apache.spark.ml.classification.RandomForestClassifier
RandomForestClassifier() - Constructor for class org.apache.spark.ml.classification.RandomForestClassifier
RandomForestModel - Class in org.apache.spark.mllib.tree.model: Represents a random forest model.
RandomForestModel(Enumeration.Value, DecisionTreeModel[]) - Constructor for class org.apache.spark.mllib.tree.model.RandomForestModel
RandomForestRegressionModel - Class in org.apache.spark.ml.regression: :: Experimental :: Random Forest model for regression.
RandomForestRegressor - Class in org.apache.spark.ml.regression: :: Experimental :: Random Forest learning algorithm for regression.
RandomForestRegressor(String) - Constructor for class org.apache.spark.ml.regression.RandomForestRegressor
RandomForestRegressor() - Constructor for class org.apache.spark.ml.regression.RandomForestRegressor
randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.randomJavaRDD(org.apache.spark.api.java.JavaSparkContext, org.apache.spark.mllib.random.RandomDataGenerator<T>, long, int, long) with the default seed.
randomJavaRDD(JavaSparkContext, RandomDataGenerator<T>, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.randomJavaRDD(org.apache.spark.api.java.JavaSparkContext, org.apache.spark.mllib.random.RandomDataGenerator<T>, long, int, long) with the default number of partitions and the default seed.
randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.randomVectorRDD(org.apache.spark.SparkContext, org.apache.spark.mllib.random.RandomDataGenerator<java.lang.Object>, long, int, int, long).
randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.randomJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, org.apache.spark.mllib.random.RandomDataGenerator<java.lang.Object>, long, int, int, long) with the default seed.
randomJavaVectorRDD(JavaSparkContext, RandomDataGenerator<Object>, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.randomJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, org.apache.spark.mllib.random.RandomDataGenerator<java.lang.Object>, long, int, int, long) with the default number of partitions and the default seed.
randomRDD(SparkContext, RandomDataGenerator<T>, long, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
RandomRDDs - Class in org.apache.spark.mllib.random: Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
RandomRDDs() - Constructor for class org.apache.spark.mllib.random.RandomRDDs
RandomSampler<T,U> - Interface in org.apache.spark.util.random: :: DeveloperApi :: A pseudorandom sampler.
randomSplit(double[]) - Method in class org.apache.spark.api.java.JavaRDD: Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.api.java.JavaRDD: Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD: Randomly splits this RDD with the provided weights.
randomSplit(double[], long) - Method in class org.apache.spark.sql.DataFrame: Randomly splits this DataFrame with the provided weights.
randomSplit(double[]) - Method in class org.apache.spark.sql.DataFrame: Randomly splits this DataFrame with the provided weights.
randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: :: DeveloperApi :: Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator.
range(long, long, long, int) - Method in class org.apache.spark.SparkContext: Creates a new RDD[Long] containing elements from start to end (exclusive), increased by step for every element.
range(long) - Method in class org.apache.spark.sql.SQLContext
range(long, long) - Method in class org.apache.spark.sql.SQLContext
range(long, long, long, int) - Method in class org.apache.spark.sql.SQLContext
rangeBetween(long, long) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the frame boundaries, from start (inclusive) to end (inclusive).
RangeDependency<T> - Class in org.apache.spark: :: DeveloperApi :: Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
RangePartitioner<K,V> - Class in org.apache.spark: A Partitioner that partitions sortable records by range into roughly equal ranges.
RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
rank() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
rank() - Method in class org.apache.spark.ml.recommendation.ALSModel
rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
rank() - Static method in class org.apache.spark.sql.functions: Window function: returns the rank of rows within a window partition.
RankingMetrics<T> - Class in org.apache.spark.mllib.evaluation: :: Experimental :: Evaluator for ranking algorithms.
RankingMetrics(RDD<Tuple2<Object, Object>>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.evaluation.RankingMetrics
rateController() - Method in class org.apache.spark.streaming.dstream.InputDStream
rateController() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream: Asynchronously maintains & sends new rate limits to the receiver through the receiver tracker.
rating() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
Rating - Class in org.apache.spark.mllib.recommendation: A more compact class to represent a rating than Tuple3[Int, Int, Double].
Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
rating() - Method in class org.apache.spark.mllib.recommendation.Rating
raw2prediction(Vector) - Method in class org.apache.spark.ml.classification.ClassificationModel: Given a vector of raw predictions, select the predicted label.
raw2prediction(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
raw2prediction(Vector) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
raw2probability(Vector) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel: Non-in-place version of raw2probabilityInPlace()
raw2probabilityInPlace(Vector) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
raw2probabilityInPlace(Vector) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
raw2probabilityInPlace(Vector) - Method in class org.apache.spark.ml.classification.NaiveBayesModel
raw2probabilityInPlace(Vector) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel: Estimate the probability of each class given the raw prediction, doing the computation in-place.
raw2probabilityInPlace(Vector) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
rdd() - Method in class org.apache.spark.api.java.JavaRDD
rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
rdd() - Method in class org.apache.spark.Dependency
rdd() - Method in class org.apache.spark.NarrowDependency
RDD<T> - Class in org.apache.spark.rdd: A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD: Construct an RDD with just a one-to-one dependency on one parent
rdd() - Method in class org.apache.spark.ShuffleDependency
rdd() - Method in class org.apache.spark.sql.DataFrame: Represents the content of the DataFrame as an RDD of Rows.
rdd() - Method in class org.apache.spark.sql.Dataset: Converts this Dataset to an RDD.
RDD() - Static method in class org.apache.spark.storage.BlockId
RDD_SCOPE_KEY() - Static method in class org.apache.spark.SparkContext
RDD_SCOPE_NO_OVERRIDE_KEY() - Static method in class org.apache.spark.SparkContext
RDDBlockId - Class in org.apache.spark.storage
RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
rddBlocks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
rddBlocks() - Method in class org.apache.spark.storage.StorageStatus: Return the RDD blocks stored in this block manager.
rddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus: Return the blocks that belong to the given RDD stored in this block manager.
RDDDataDistribution - Class in org.apache.spark.status.api.v1
RDDFunctions<T> - Class in org.apache.spark.mllib.rdd: Machine learning specific RDD functions.
RDDFunctions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RDDFunctions
rddId() - Method in class org.apache.spark.CleanCheckpoint
rddId() - Method in class org.apache.spark.CleanRDD
rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
rddId() - Method in class org.apache.spark.storage.RDDBlockId
RDDInfo - Class in org.apache.spark.storage
RDDInfo(int, String, int, StorageLevel, Seq<Object>, String, Option<org.apache.spark.rdd.RDDOperationScope>) - Constructor for class org.apache.spark.storage.RDDInfo
rddInfoList() - Method in class org.apache.spark.ui.storage.StorageListener: Filter RDD info to include only those with cached partitions
rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
RDDPartitionInfo - Class in org.apache.spark.status.api.v1
rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
rdds() - Method in class org.apache.spark.rdd.UnionRDD
RDDStorageInfo - Class in org.apache.spark.status.api.v1
rddStorageLevel(int) - Method in class org.apache.spark.storage.StorageStatus: Return the storage level, if any, used by the given RDD in this block manager.
rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.rdd.RDD
rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.SparkContext
rddToDataFrameHolder(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLImplicits: Creates a DataFrame from an RDD of Product (e.g. case classes, tuples).
rddToDatasetHolder(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLImplicits: Creates a Dataset from an RDD.
rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.rdd.RDD
rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.rdd.RDD
rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.SparkContext
rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, <any>, <any>) - Static method in class org.apache.spark.rdd.RDD
rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
read() - Method in class org.apache.spark.api.r.BaseRRDD
read() - Static method in class org.apache.spark.ml.classification.LogisticRegressionModel
read() - Static method in class org.apache.spark.ml.classification.NaiveBayesModel
read() - Static method in class org.apache.spark.ml.clustering.DistributedLDAModel
read() - Static method in class org.apache.spark.ml.clustering.KMeansModel
read() - Static method in class org.apache.spark.ml.clustering.LocalLDAModel
read() - Static method in class org.apache.spark.ml.feature.ChiSqSelectorModel
read() - Static method in class org.apache.spark.ml.feature.CountVectorizerModel
read() - Static method in class org.apache.spark.ml.feature.IDFModel
read() - Static method in class org.apache.spark.ml.feature.MinMaxScalerModel
read() - Static method in class org.apache.spark.ml.feature.PCAModel
read() - Static method in class org.apache.spark.ml.feature.StandardScalerModel
read() - Static method in class org.apache.spark.ml.feature.StringIndexerModel
read() - Static method in class org.apache.spark.ml.feature.VectorIndexerModel
read() - Static method in class org.apache.spark.ml.feature.Word2VecModel
read() - Static method in class org.apache.spark.ml.Pipeline
read() - Static method in class org.apache.spark.ml.PipelineModel
read() - Static method in class org.apache.spark.ml.recommendation.ALSModel
read() - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
read() - Static method in class org.apache.spark.ml.regression.IsotonicRegressionModel
read() - Static method in class org.apache.spark.ml.regression.LinearRegressionModel
read() - Static method in class org.apache.spark.ml.tuning.CrossValidator
read() - Static method in class org.apache.spark.ml.tuning.CrossValidatorModel
read() - Method in interface org.apache.spark.ml.util.MLReadable: Returns an MLReader instance for this class.
read(Kryo, Input, Class<Iterable<?>>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
read() - Method in class org.apache.spark.sql.SQLContext
read() - Method in class org.apache.spark.storage.BufferReleasingInputStream
read(byte[]) - Method in class org.apache.spark.storage.BufferReleasingInputStream
read(byte[], int, int) - Method in class org.apache.spark.storage.BufferReleasingInputStream
read(WriteAheadLogRecordHandle) - Method in class org.apache.spark.streaming.util.WriteAheadLog: Read a written record based on the given record handle.
readAll() - Method in class org.apache.spark.streaming.util.WriteAheadLog: Read and return an iterator of all the records that have been written but not yet cleaned up.
readBytes() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
readData(int) - Method in class org.apache.spark.api.r.BaseRRDD
readData(int) - Method in class org.apache.spark.api.r.PairwiseRRDD
readData(int) - Method in class org.apache.spark.api.r.RRDD
readData(int) - Method in class org.apache.spark.api.r.StringRRDD
readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
readExternal(ObjectInput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
readKey(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream: Reads the object representing the key of a key-value pair.
readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream: The most general-purpose method to read an object.
readRecords() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
readValue(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream: Reads the object representing the value of a key-value pair.
ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction: Blocks until this action completes.
ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
reason() - Method in class org.apache.spark.ExecutorLostFailure
reason() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
recall(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns recall for a given label (category)
recall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns recall (equal to precision for a multiclass classifier, because the sum of all false positives equals the sum of all false negatives)
recall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns document-based recall averaged by the number of documents
recall(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns recall for a given label (category)
recallByThreshold() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Returns the (threshold, recall) curve as a DataFrame with two fields.
recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the (threshold, recall) curve.
Receiver<T> - Class in org.apache.spark.streaming.receiver: :: DeveloperApi :: Abstract class of a receiver that can be run on worker nodes to receive external data.
Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
ReceiverInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Class having information about a receiver
ReceiverInfo(int, String, boolean, String, String, String, String, long) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream: Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data.
ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream with an arbitrary user-implemented receiver.
receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream with an arbitrary user-implemented receiver.
recommendProducts(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Recommends products to a user.
recommendProductsForUsers(int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Recommends topK products for all users.
recommendUsers(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Recommends users to a product.
recommendUsersForProducts(int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Recommends topK users for all products.
recordJobProperties(int, Properties) - Method in class org.apache.spark.scheduler.JobLogger: Record job properties into job log file
RECORDS_BETWEEN_BYTES_READ_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.HadoopRDD: Update the input bytes read metric each time this number of records has been read
RECORDS_BETWEEN_BYTES_WRITTEN_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.PairRDDFunctions
recordsRead() - Method in class org.apache.spark.status.api.v1.InputMetricDistributions
recordsRead() - Method in class org.apache.spark.status.api.v1.InputMetrics
recordsRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
recordsWritten() - Method in class org.apache.spark.status.api.v1.OutputMetricDistributions
recordsWritten() - Method in class org.apache.spark.status.api.v1.OutputMetrics
recordsWritten() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetrics
recordTaskMetrics(int, String, TaskInfo, TaskMetrics) - Method in class org.apache.spark.scheduler.JobLogger: Record task metrics into job log files, including execution info and shuffle metrics
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD: Reduces the elements of this RDD using the specified commutative and associative binary operator.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.sql.Dataset: (Scala-specific) Reduces the elements of this Dataset using the specified binary function.
reduce(ReduceFunction<T>) - Method in class org.apache.spark.sql.Dataset: (Java-specific) Reduces the elements of this Dataset using the specified binary function.
reduce(B, I) - Method in class org.apache.spark.sql.expressions.Aggregator: Combine two values to produce a new value.
reduce(Function2<V, V, V>) - Method in class org.apache.spark.sql.GroupedDataset: Reduces the elements of each group of data using the specified binary function.
reduce(ReduceFunction<V>) - Method in class org.apache.spark.sql.GroupedDataset: Reduces the elements of each group of data using the specified binary function.
reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing each RDD of this DStream.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function.
reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey to each RDD.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Create a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by reducing over a sliding window using incremental computation.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window on this DStream.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying incremental reduceByKey over a sliding window.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Merge the values for each key using an associative reduce function, but return the results immediately to the master as a Map.
reduceByKeyToDriver(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Alias for reduceByKeyLocally
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Deprecated. This API is not Java compatible.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD has a single element generated by reducing all elements in a sliding window over this DStream.
ReduceFunction<T> - Interface in org.apache.spark.api.java.function: Base interface for function used in Dataset's reduce.
reduceId() - Method in class org.apache.spark.FetchFailed
reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
reduceId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
reduceId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
refreshTable(String) - Method in class org.apache.spark.sql.hive.HiveContext: Invalidate and refresh all the cached metadata of the given table.
regexp_extract(Column, String, int) - Static method in class org.apache.spark.sql.functions: Extract a specific group (idx) identified by a Java regex from the specified string column.
regexp_replace(Column, String, String) - Static method in class org.apache.spark.sql.functions: Replace all substrings of the specified string value that match regexp with rep.
RegexTokenizer - Class in org.apache.spark.ml.feature: :: Experimental :: A regex based tokenizer that extracts tokens either by using the provided regex pattern to split the text (default) or repeatedly matching the regex (if gaps is false).
RegexTokenizer(String) - Constructor for class org.apache.spark.ml.feature.RegexTokenizer
RegexTokenizer() - Constructor for class org.apache.spark.ml.feature.RegexTokenizer
register(String, UserDefinedAggregateFunction) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined aggregate function (UDAF).
register(String, Function0<RT>, TypeTags.TypeTag<RT>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 0 arguments as user-defined function (UDF).
register(String, Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 1 argument as user-defined function (UDF).
register(String, Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 2 arguments as user-defined function (UDF).
register(String, Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 3 arguments as user-defined function (UDF).
register(String, Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 4 arguments as user-defined function (UDF).
register(String, Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 5 arguments as user-defined function (UDF).
register(String, Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 6 arguments as user-defined function (UDF).
register(String, Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 7 arguments as user-defined function (UDF).
register(String, Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 8 arguments as user-defined function (UDF).
register(String, Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 9 arguments as user-defined function (UDF).
register(String, Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 10 arguments as user-defined function (UDF).
register(String, Function11<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 11 arguments as user-defined function (UDF).
register(String, Function12<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 12 arguments as user-defined function (UDF).
register(String, Function13<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 13 arguments as user-defined function (UDF).
register(String, Function14<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 14 arguments as user-defined function (UDF).
register(String, Function15<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 15 arguments as user-defined function (UDF).
register(String, Function16<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 16 arguments as user-defined function (UDF).
register(String, Function17<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 17 arguments as user-defined function (UDF).
register(String, Function18<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 18 arguments as user-defined function (UDF).
register(String, Function19<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 19 arguments as user-defined function (UDF).
register(String, Function20<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 20 arguments as user-defined function (UDF).
register(String, Function21<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 21 arguments as user-defined function (UDF).
register(String, Function22<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>, TypeTags.TypeTag<A22>) - Method in class org.apache.spark.sql.UDFRegistration: Register a Scala closure of 22 arguments as user-defined function (UDF).
register(String, UDF1<?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 1 argument.
register(String, UDF2<?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 2 arguments.
register(String, UDF3<?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 3 arguments.
register(String, UDF4<?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 4 arguments.
register(String, UDF5<?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 5 arguments.
register(String, UDF6<?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 6 arguments.
register(String, UDF7<?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 7 arguments.
register(String, UDF8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 8 arguments.
register(String, UDF9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 9 arguments.
register(String, UDF10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 10 arguments.
register(String, UDF11<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 11 arguments.
register(String, UDF12<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 12 arguments.
register(String, UDF13<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 13 arguments.
register(String, UDF14<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 14 arguments.
register(String, UDF15<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 15 arguments.
register(String, UDF16<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 16 arguments.
register(String, UDF17<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 17 arguments.
register(String, UDF18<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 18 arguments.
register(String, UDF19<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 19 arguments.
register(String, UDF20<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 20 arguments.
register(String, UDF21<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 21 arguments.
register(String, UDF22<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration: Register a user-defined function with 22 arguments.
register(QueryExecutionListener) - Method in class org.apache.spark.sql.util.ExecutionListenerManager: Registers the specified QueryExecutionListener.
registerAvroSchemas(Seq<Schema>) - Method in class org.apache.spark.SparkConf: Use Kryo serialization and register the given set of Avro schemas so that the generic record serializer can decrease network IO
registerClasses(Kryo) - Method in class org.apache.spark.graphx.GraphKryoRegistrator
registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
registerDialect(JdbcDialect) - Static method in class org.apache.spark.sql.jdbc.JdbcDialects: Register a dialect for use on all new matching JDBC DataFrames.
registerKryoClasses(SparkConf) - Static method in class org.apache.spark.graphx.GraphXUtils: Registers classes that GraphX uses with Kryo.
registerKryoClasses(Class<?>[]) - Method in class org.apache.spark.SparkConf: Use Kryo serialization and register the given set of classes with Kryo.
registerPython(String, UserDefinedPythonFunction) - Method in class org.apache.spark.sql.UDFRegistration
registerStream(DStream<BinarySample>) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
registerStream(JavaDStream<BinarySample>) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
registerTempTable(String) - Method in class org.apache.spark.sql.DataFrame: Registers this DataFrame as a temporary table using the given name.
Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
RegressionEvaluator - Class in org.apache.spark.ml.evaluation: :: Experimental :: Evaluator for regression, which expects two input columns: prediction and label.
RegressionEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.RegressionEvaluator
RegressionEvaluator() - Constructor for class org.apache.spark.ml.evaluation.RegressionEvaluator
RegressionMetrics - Class in org.apache.spark.mllib.evaluation: Evaluator for regression.
RegressionMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.RegressionMetrics
RegressionModel<FeaturesType,M extends RegressionModel<FeaturesType,M>> - Class in org.apache.spark.ml.regression: :: DeveloperApi ::
RegressionModel() - Constructor for class org.apache.spark.ml.regression.RegressionModel
RegressionModel - Interface in org.apache.spark.mllib.regression
reindex() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
reindex() - Method in class org.apache.spark.graphx.VertexRDD: Construct a new VertexRDD that is indexed by only the visible vertices.
RelationProvider - Interface in org.apache.spark.sql.sources: :: DeveloperApi :: Implemented by objects that produce relations for a specific kind of data source.
relativeDirection(long) - Method in class org.apache.spark.graphx.Edge: Return the relative direction of the edge to the corresponding vertex.
remainder(Decimal) - Method in class org.apache.spark.sql.types.Decimal
remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Sets each DStream in this context to remember the RDDs it generated in the last given duration.
remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext: Sets each DStream in this context to remember the RDDs it generated in the last given duration.
rememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
remoteBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
remoteBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
remoteBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
remoteBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
remove(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap: Removes a key from this map and returns its value associated previously as an option.
remove(String) - Method in class org.apache.spark.SparkConf: Remove a parameter from the configuration
remove() - Method in class org.apache.spark.streaming.State: Remove the state if it exists.
repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.api.java.JavaRDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return a new RDD that has exactly numPartitions partitions.
repartition(int, Column...) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame partitioned by the given partitioning expressions into numPartitions.
repartition(Column...) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame partitioned by the given partitioning expressions preserving the existing number of partitions.
repartition(int) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame that has exactly numPartitions partitions.
repartition(int, Seq<Column>) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame partitioned by the given partitioning expressions into numPartitions.
repartition(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame partitioned by the given partitioning expressions preserving the existing number of partitions.
repartition(int) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset that has exactly numPartitions partitions.
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream with an increased or decreased level of parallelism.
repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream with an increased or decreased level of parallelism.
repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
repartitionAndSortWithinPartitions(Partitioner, Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD: Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.rdd.OrderedRDDFunctions: Repartition the RDD according to the given partitioner and, within each resulting partition, sort records by their keys.
repeat(Column, int) - Static method in class org.apache.spark.sql.functions: Repeats a string column n times, and returns it as a new string column.
replace(String, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Replaces values matching keys in replacement map with the corresponding values.
replace(String[], Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: Replaces values matching keys in replacement map with the corresponding values.
replace(String, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Replaces values matching keys in replacement map.
replace(Seq<String>, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions: (Scala-specific) Replaces values matching keys in replacement map.
replicatedVertexView() - Method in class org.apache.spark.graphx.impl.GraphImpl
replication() - Method in class org.apache.spark.storage.StorageLevel
reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Report exceptions in receiving data.
requestExecutors(int) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Request an additional number of executors from the cluster manager.
reset() - Method in class org.apache.spark.storage.BufferReleasingInputStream
resetIterator() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
residuals() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Residuals (label - predicted value)
resolve(String) - Method in class org.apache.spark.sql.DataFrame
resolvedTEncoder() - Method in class org.apache.spark.sql.Dataset: The encoder for this Dataset that has been resolved to its output schema.
restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver: Restart the receiver.
Resubmitted - Class in org.apache.spark: :: DeveloperApi :: A ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.
Resubmitted() - Constructor for class org.apache.spark.Resubmitted
result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction: Awaits and returns the result (of type T) of this action.
result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
resultSerializationTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
resultSerializationTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
resultSize() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
resultSize() - Method in class org.apache.spark.status.api.v1.TaskMetrics
retainedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
retainedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
retryWaitMs(SparkConf) - Static method in class org.apache.spark.util.RpcUtils: Returns the configured number of milliseconds to wait on each retry
ReturnStatementFinder - Class in org.apache.spark.util
ReturnStatementFinder() - Constructor for class org.apache.spark.util.ReturnStatementFinder
reverse() - Method in class org.apache.spark.graphx.EdgeDirection: Reverse the direction of an edge.
reverse() - Method in class org.apache.spark.graphx.EdgeRDD: Reverse all the edges in this RDD.
reverse() - Method in class org.apache.spark.graphx.Graph: Reverses all edges in the graph.
reverse() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
reverse() - Method in class org.apache.spark.graphx.impl.GraphImpl
reverse(Column) - Static method in class org.apache.spark.sql.functions: Reverses the string column and returns it as a new string column.
reverseRoutingTables() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
reverseRoutingTables() - Method in class org.apache.spark.graphx.VertexRDD: Returns a new VertexRDD reflecting a reversal of all edge directions in the corresponding EdgeRDD.
ReviveOffers - Class in org.apache.spark.scheduler.local
ReviveOffers() - Constructor for class org.apache.spark.scheduler.local.ReviveOffers
RFormula - Class in org.apache.spark.ml.feature: :: Experimental :: Implements the transforms required for fitting a dataset against an R model formula.
RFormula(String) - Constructor for class org.apache.spark.ml.feature.RFormula
RFormula() - Constructor for class org.apache.spark.ml.feature.RFormula
RFormulaModel - Class in org.apache.spark.ml.feature: :: Experimental :: A fitted RFormula.
RidgeRegressionModel - Class in org.apache.spark.mllib.regression: Regression model trained using RidgeRegression.
RidgeRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionModel
RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train a regression model with L2-regularization using Stochastic Gradient Descent.
RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Construct a RidgeRegression object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
right() - Method in class org.apache.spark.sql.sources.And
right() - Method in class org.apache.spark.sql.sources.Or
rightCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit: Get sorted categories which split to the right
rightChild() - Method in class org.apache.spark.ml.tree.InternalNode
rightChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return the index of the right child of this node.
rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions: Perform a right outer join of this and other.
rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
rightPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
rint(Column) - Static method in class org.apache.spark.sql.functions: Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
rint(String) - Static method in class org.apache.spark.sql.functions: Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
rlike(String) - Method in class org.apache.spark.sql.Column: SQL RLIKE expression (LIKE with Regex).
RMATa() - Static method in class org.apache.spark.graphx.util.GraphGenerators
RMATb() - Static method in class org.apache.spark.graphx.util.GraphGenerators
RMATc() - Static method in class org.apache.spark.graphx.util.GraphGenerators
RMATd() - Static method in class org.apache.spark.graphx.util.GraphGenerators
rmatGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators: A random graph generator using the R-MAT model, proposed in "R-MAT: A Recursive Model for Graph Mining" by Chakrabarti et al.
rnd() - Method in class org.apache.spark.rdd.PartitionCoalescer
roc() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary: Returns the receiver operating characteristic (ROC) curve, which is a DataFrame having two fields (FPR, TPR) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns the receiver operating characteristic (ROC) curve, which is an RDD of (false positive rate, true positive rate) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
rollup(Column...) - Method in class org.apache.spark.sql.DataFrame: Create a multi-dimensional rollup for the current DataFrame using the specified columns, so we can run aggregation on them.
rollup(String, String...) - Method in class org.apache.spark.sql.DataFrame: Create a multi-dimensional rollup for the current DataFrame using the specified columns, so we can run aggregation on them.
rollup(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame: Create a multi-dimensional rollup for the current DataFrame using the specified columns, so we can run aggregation on them.
rollup(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Create a multi-dimensional rollup for the current DataFrame using the specified columns, so we can run aggregation on them.
root() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
rootMeanSquaredError() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: Returns the root mean squared error, which is defined as the square root of the mean squared error.
rootMeanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics: Returns the root mean squared error, which is defined as the square root of the mean squared error.
rootNode() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
rootNode() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
round(Column) - Static method in class org.apache.spark.sql.functions: Returns the value of the column e rounded to 0 decimal places.
round(Column, int) - Static method in class org.apache.spark.sql.functions: Rounds the value of e to scale decimal places if scale >= 0, or at the integral part when scale < 0.
ROUND_CEILING() - Static method in class org.apache.spark.sql.types.Decimal
ROUND_FLOOR() - Static method in class org.apache.spark.sql.types.Decimal
ROUND_HALF_UP() - Static method in class org.apache.spark.sql.types.Decimal
Row - Interface in org.apache.spark.sql: Represents one row of output from a relational operator.
row_number() - Static method in class org.apache.spark.sql.functions: Window function: returns a sequential number starting at 1 within a window partition.
RowFactory - Class in org.apache.spark.sql: A factory class used to construct Row objects.
RowFactory() - Constructor for class org.apache.spark.sql.RowFactory
rowIndices() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
RowMatrix - Class in org.apache.spark.mllib.linalg.distributed: Represents a row-oriented distributed Matrix with no meaningful row indices.
RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix: Alternative constructor leaving matrix dimensions to be determined automatically.
rowNumber() - Static method in class org.apache.spark.sql.functions: Deprecated.
As of 1.6.0, replaced by row_number. This will be removed in Spark 2.0.
rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
rowsBetween(long, long) - Method in class org.apache.spark.sql.expressions.WindowSpec: Defines the frame boundaries, from start (inclusive) to end (inclusive).
rowsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
rpad(Column, int, String) - Static method in class org.apache.spark.sql.functions: Right-pads the string column with pad to a length of len.
rpcEnv() - Method in class org.apache.spark.SparkEnv
RpcUtils - Class in org.apache.spark.util
RpcUtils() - Constructor for class org.apache.spark.util.RpcUtils
RRDD<T> - Class in org.apache.spark.api.r: An RDD that stores serialized R objects as Array[Byte].
RRDD(RDD<T>, byte[], String, String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.RRDD
rtrim(Column) - Static method in class org.apache.spark.sql.functions: Trims the spaces from the right end of the specified string value.
run(Function0<T>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction: Executes some action enclosed in the closure.
run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ConnectedComponents: Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
run(Graph<VD, ED>, int, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.LabelPropagation: Run static Label Propagation for detecting communities in networks.
run(Graph<VD, ED>, int, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank: Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
run(Graph<VD, ED>, Seq<Object>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ShortestPaths: Computes shortest paths to the given set of landmark vertices.
run(Graph<VD, ED>, int, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.StronglyConnectedComponents: Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.
run(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus: Implement SVD++ based on "Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model", available at http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf.
run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.TriangleCount
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Runs the bisecting k-means algorithm.
run(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Java-friendly version of run().
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Perform expectation maximization.
run(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Java-friendly version of run().
run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans: Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
run(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LDA: Learn an LDA model using the given dataset.
run(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LDA: Java-friendly version of run()
run(Graph<Object, Object>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering: Run the PIC algorithm on Graph.
run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering: Run the PIC algorithm.
run(JavaRDD<Tuple3<Long, Long, Double>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering: A Java-friendly version of PowerIterationClustering.run.
run(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.AssociationRules: Computes the association rules with confidence above minConfidence.
run(JavaRDD<FPGrowth.FreqItemset<Item>>) - Method in class org.apache.spark.mllib.fpm.AssociationRules: Java-friendly version of run.
run(RDD<Object>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.FPGrowth: Computes an FP-Growth model that contains frequent itemsets.
run(JavaRDD<Basket>) - Method in class org.apache.spark.mllib.fpm.FPGrowth: Java-friendly version of run.
run(RDD<Object[]>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Finds the complete set of frequent sequential patterns in the input sequences of itemsets.
run(JavaRDD<Sequence>) - Method in class org.apache.spark.mllib.fpm.PrefixSpan: A Java-friendly version of run() that reads sequences from a JavaRDD and returns frequent sequences in a PrefixSpanModel.
run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
run(JavaRDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries starting from the initial weights provided.
run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
run(JavaRDD<Tuple3<Double, Double, Double>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model over an RDD
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Method to train a gradient boosting model
run(JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.run.
run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.RandomForest: Method to train a random forest model over an RDD
run() - Method in class org.apache.spark.rdd.PartitionCoalescer: Runs the packing algorithm and returns an array of PartitionGroups that if possible are load balanced and grouped by locality
run() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread
run() - Method in class org.apache.spark.util.SparkShutdownHook
runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, <any>, long) - Method in class org.apache.spark.SparkContext: :: DeveloperApi :: Run a job that can return approximate results.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.ComplexFutureAction: Runs a Spark job.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and pass the results to the given handler function.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and return the results as an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on a given set of partitions of an RDD, but take a function of type Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and pass the results to the given handler function.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, ClassTag) - Method in class org.apache.spark.SparkContext: Run a function on a given set of partitions in an RDD and return the results as an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, boolean, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on a given set of partitions of an RDD, but take a function of type Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and return the results in an array.
runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and pass the results to a handler function.
runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag) - Method in class org.apache.spark.SparkContext: Run a job on all partitions in an RDD and pass the results to a handler function.
runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS: Run Limited-memory BFGS (L-BFGS) in parallel.
runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector, double) - Static method in class org.apache.spark.mllib.optimization.GradientDescent: Run stochastic gradient descent (SGD) in parallel using mini batches.
runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent: Alias of runMiniBatchSGD with convergenceTol set to default value of 0.001.
running() - Method in class org.apache.spark.scheduler.TaskInfo
runningLocally() - Method in class org.apache.spark.TaskContext
runSqlHive(String) - Method in class org.apache.spark.sql.hive.HiveContext
runSVDPlusPlus(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus: This method is now replaced by the updated version of run() and returns exactly the same result.
RuntimePercentage - Class in org.apache.spark.scheduler
RuntimePercentage(double, Option<Object>, double) - Constructor for class org.apache.spark.scheduler.RuntimePercentage
runUntilConvergence(Graph<VD, ED>, double, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank: Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
runUntilConvergenceWithOptions(Graph<VD, ED>, double, double, Option<Object>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank: Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
runWithOptions(Graph<VD, ED>, int, double, Option<Object>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank: Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
runWithValidation(RDD<LabeledPoint>, RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Method to validate a gradient boosting model
runWithValidation(JavaRDD<LabeledPoint>, JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.runWithValidation.
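
Note on the rint entries above: their description matches the wording of java.lang.Math.rint, so the tie-breaking behaviour can be sketched in plain Java. This is a minimal sketch, assuming that Spark's rint follows Math.rint semantics (an exact tie goes to the even integer). The class name RintSketch is hypothetical, and only the JDK call is exercised; no Spark classes are involved.

```java
public class RintSketch {
    // Assumption: org.apache.spark.sql.functions.rint mirrors java.lang.Math.rint.
    // Only the JDK method is called here.
    static double rint(double x) {
        return Math.rint(x);
    }

    public static void main(String[] args) {
        double[] inputs = {2.4, 2.5, 3.5, -2.5};
        for (double x : inputs) {
            // Math.round is printed for contrast: it rounds ties toward positive infinity,
            // whereas Math.rint rounds ties to the nearest even integer.
            System.out.println(x + " -> rint " + rint(x) + ", Math.round " + Math.round(x));
        }
    }
}
```

The contrast with Math.round is included only to make the tie rule visible; whether round(Column) in this release behaves like Math.round is not stated in the index and is not claimed here.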

S

s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a sampled subset of this RDD.
sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a sampled subset of this RDD.
sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD: Return a sampled subset of this RDD.
sample(boolean, double, long) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame by sampling a fraction of rows.
sample(boolean, double) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame by sampling a fraction of rows, using a random seed.
sample(boolean, double, long) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by sampling a fraction of records.
sample(boolean, double) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by sampling a fraction of records, using a random seed.
sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliCellSampler
sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliSampler
sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler: Takes a random sample.
sampleBy(String, Map<T, Object>, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Returns a stratified sample without replacement based on the fraction given on each stratum.
sampleBy(String, Map<T, Double>, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions: Returns a stratified sample without replacement based on the fraction given on each stratum.
sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKey(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return a subset of this RDD sampled by key (via stratified sampling).
sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
sampleByKeyExact(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return a subset of this RDD sampled by key (via stratified sampling) containing exactly math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).
sampleStdev() - Method in class org.apache.spark.util.StatCounter: Return the sample standard deviation of the values, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).
sampleVariance() - Method in class org.apache.spark.util.StatCounter: Return the sample variance, which corrects for bias in estimating the variance by dividing by N-1 instead of N.
save(String) - Method in interface org.apache.spark.ml.util.MLWritable: Saves this ML instance to the input path, a shortcut of write.save(path).
save(String) - Method in class org.apache.spark.ml.util.MLWriter: Saves the ML instances to the input path.
save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.SVMModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Save this model to the given path.
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel: Save this model to the given path.
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LassoModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
save(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Saveable: Save this model to the given path.
save(String) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().save(path). This will be removed in Spark 2.0.
save(String, SaveMode) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().mode(mode).save(path). This will be removed in Spark 2.0.
save(String, String) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().format(source).save(path). This will be removed in Spark 2.0.
save(String, String, SaveMode) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).save(path). This will be removed in Spark 2.0.
save(String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).save(path). This will be removed in Spark 2.0.
save(String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).save(path). This will be removed in Spark 2.0.
save(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame at the specified path.
save() - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame as the specified table.
Saveable - Interface in org.apache.spark.mllib.util: :: DeveloperApi ::
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for that storage system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class supporting the key and value types K and V in this RDD.
saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Save labeled data in LIBSVM format.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported storage system, using a Configuration object for that storage system.
saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop Configuration object for that storage system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD: Output the RDD to any Hadoop-supported file system.
saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions: Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat (mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Save each RDD in this DStream as a Hadoop file.
saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream: Save each RDD in this DStream as a Sequence file of serialized objects.
saveAsParquetFile(String) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().parquet(). This will be removed in Spark 2.0.
saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions: Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key and value types.
saveAsTable(String) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().saveAsTable(tableName). This will be removed in Spark 2.0.
saveAsTable(String, SaveMode) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().mode(mode).saveAsTable(tableName). This will be removed in Spark 2.0.
saveAsTable(String, String) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().format(source).saveAsTable(tableName). This will be removed in Spark 2.0.
saveAsTable(String, String, SaveMode) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).saveAsTable(tableName). This will be removed in Spark 2.0.
saveAsTable(String, String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).saveAsTable(tableName). This will be removed in Spark 2.0.
saveAsTable(String, String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).saveAsTable(tableName). This will be removed in Spark 2.0.
saveAsTable(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame as the specified table.
saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a text file, using string representations of elements.
saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD: Save this RDD as a compressed text file, using string representations of elements.
saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream: Save each RDD in this DStream as a text file, using string representations of elements.
saveImpl(String) - Method in class org.apache.spark.ml.util.MLWriter: save() handles overwriting and then calls this method.
saveLabeledData(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils: Deprecated.
Use RDD.saveAsTextFile(java.lang.String) for saving and MLUtils.loadLabeledPoints(org.apache.spark.SparkContext, java.lang.String, int) for loading.
SaveMode - Enum in org.apache.spark.sql: SaveMode is used to specify the expected behavior of saving a DataFrame to a data source.
sc() - Method in class org.apache.spark.api.java.JavaSparkContext
sc() - Method in class org.apache.spark.sql.SQLContext.implicits$.StringToColumn
sc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Deprecated.
As of 0.9.0, replaced by sparkContext
sc() - Method in class org.apache.spark.streaming.StreamingContext
scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
scale() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
scale() - Method in class org.apache.spark.mllib.random.GammaGenerator
scale() - Method in class org.apache.spark.sql.types.Decimal
scale() - Method in class org.apache.spark.sql.types.DecimalType
scale() - Method in class org.apache.spark.sql.types.PrecisionInfo
scalingVec() - Method in class org.apache.spark.ml.feature.ElementwiseProduct: the vector to multiply with input vectors
scalingVec() - Method in class org.apache.spark.mllib.feature.ElementwiseProduct
scheduler() - Method in class org.apache.spark.streaming.StreamingContext
schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for the first job of this batch to start processing from the time this batch was submitted to the streaming scheduler.
SchedulingMode - Class in org.apache.spark.scheduler: "FAIR" and "FIFO" determine which policy is used to order tasks amongst a Schedulable's sub-queues; "NONE" is used when a Schedulable has no sub-queues.
SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
schedulingMode() - Method in class org.apache.spark.ui.jobs.JobProgressListener
schedulingPool() - Method in class org.apache.spark.status.api.v1.StageData
schema() - Method in class org.apache.spark.sql.DataFrame: Returns the schema of this DataFrame.
schema(StructType) - Method in class org.apache.spark.sql.DataFrameReader: Specifies the input schema.
schema() - Method in class org.apache.spark.sql.Dataset: Returns the schema of the encoded form of the objects in this Dataset.
schema() - Method in interface org.apache.spark.sql.Encoder: Returns the schema of encoding this type of object as a Row.
schema() - Method in interface org.apache.spark.sql.Row: Schema for the row.
schema() - Method in class org.apache.spark.sql.sources.BaseRelation
schema() - Method in class org.apache.spark.sql.sources.HadoopFsRelation: Schema of this relation.
SchemaRelationProvider - Interface in org.apache.spark.sql.sources: :: DeveloperApi :: Implemented by objects that produce relations for a specific kind of data source with a given schema.
scope() - Method in class org.apache.spark.rdd.RDD: The scope associated with the operation that created this RDD.
scope() - Method in class org.apache.spark.storage.RDDInfo
scoreAndLabels() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
ScriptTransformationWriterThread - Class in org.apache.spark.sql.hive.execution
ScriptTransformationWriterThread(Iterator<InternalRow>, Seq<DataType>, org.apache.spark.sql.catalyst.expressions.Projection, AbstractSerDe, ObjectInspector, HiveScriptIOSchema, OutputStream, Process, org.apache.spark.util.CircularBuffer, TaskContext, Configuration) - Constructor for class org.apache.spark.sql.hive.execution.ScriptTransformationWriterThread
second(Column) - Static method in class org.apache.spark.sql.functions: Extracts the seconds as an integer from a given date/timestamp/string.
seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
seconds(long) - Static method in class org.apache.spark.streaming.Durations
Seconds - Class in org.apache.spark.streaming: Helper object that creates an instance of Duration representing a given number of seconds.
Seconds() - Constructor for class org.apache.spark.streaming.Seconds
securityManager() - Method in class org.apache.spark.SparkEnv
select(Column...) - Method in class org.apache.spark.sql.DataFrame: Selects a set of column based expressions.
select(String, String...) - Method in class org.apache.spark.sql.DataFrame: Selects a set of columns.
select(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame: Selects a set of column based expressions.
select(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Selects a set of columns.
select(Column...) - Method in class org.apache.spark.sql.Dataset: Returns a new DataFrame by selecting a set of column based expressions.
select(Seq<Column>) - Method in class org.apache.spark.sql.Dataset: Returns a new DataFrame by selecting a set of column based expressions.
select(TypedColumn<T, U1>, Encoder<U1>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by computing the given Column expression for each element.
select(TypedColumn<T, U1>, TypedColumn<T, U2>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by computing the given Column expressions for each element.
select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by computing the given Column expressions for each element.
select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>, TypedColumn<T, U4>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by computing the given Column expressions for each element.
select(TypedColumn<T, U1>, TypedColumn<T, U2>, TypedColumn<T, U3>, TypedColumn<T, U4>, TypedColumn<T, U5>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset by computing the given Column expressions for each element.
selectedFeatures() - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
selectedFeatures() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
selectExpr(String...) - Method in class org.apache.spark.sql.DataFrame: Selects a set of SQL expressions.
selectExpr(Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Selects a set of SQL expressions.
selectUntyped(Seq<TypedColumn<?, ?>>) - Method in class org.apache.spark.sql.Dataset: Internal helper function for building typed selects that return tuples.
sendToDst(A) - Method in class org.apache.spark.graphx.EdgeContext: Sends a message to the destination vertex.
sendToDst(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
sendToSrc(A) - Method in class org.apache.spark.graphx.EdgeContext: Sends a message to the source vertex.
sendToSrc(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
sequence() - Method in class org.apache.spark.mllib.fpm.PrefixSpan.FreqSequence
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext: Get an RDD for a Hadoop SequenceFile.
sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext: Get an RDD for a Hadoop SequenceFile with given key and value types.
sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext: Version of sequenceFile() for types implicitly convertible to Writables through a WritableConverter.
SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd: Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion.
SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Class<? extends Writable>, Class<? extends Writable>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
SerializationStream - Class in org.apache.spark.serializer: :: DeveloperApi :: A stream for writing serialized objects.
SerializationStream() - Constructor for class org.apache.spark.serializer.SerializationStream
serialize(Object) - Method in class org.apache.spark.mllib.linalg.VectorUDT
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.DummySerializerInstance
serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
serialize(Object) - Method in class org.apache.spark.sql.types.UserDefinedType: Convert the user type to a SQL datum
serializedData() - Method in class org.apache.spark.scheduler.local.StatusUpdate
serializedPyClass() - Method in class org.apache.spark.sql.types.UserDefinedType: Serialized Python UDT class, if it exists.
Serializer - Class in org.apache.spark.serializer: :: DeveloperApi :: A serializer.
Serializer() - Constructor for class org.apache.spark.serializer.Serializer
serializer() - Method in class org.apache.spark.ShuffleDependency
serializer() - Method in class org.apache.spark.SparkEnv
SerializerInstance - Class in org.apache.spark.serializer: :: DeveloperApi :: An instance of a serializer, for use by one thread at a time.
SerializerInstance() - Constructor for class org.apache.spark.serializer.SerializerInstance
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.DummySerializerInstance
serializeStream(OutputStream) - Method in class org.apache.spark.serializer.SerializerInstance
set(Edge<ED>) - Method in class org.apache.spark.graphx.EdgeTriplet: Set the edge properties of this triplet.
set(long, long, int, int, VD, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
set(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params
set(String, Object) - Method in interface org.apache.spark.ml.param.Params
set(ParamPair<?>) - Method in interface org.apache.spark.ml.param.Params
set(String, String) - Method in class org.apache.spark.SparkConf: Set a configuration variable.
set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
set(long) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given Long.
set(int) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given Int.
set(long, int, int) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given unscaled Long, with a given precision and scale.
set(BigDecimal, int, int) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given BigDecimal value, with a given precision and scale.
set(BigDecimal) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given BigDecimal value, inheriting its precision and scale.
set(Decimal) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given Decimal value.
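The unscaled form set(long, int, int) listed above can be read like java.math.BigDecimal: the long holds the digits, and the scale says how many of them fall after the decimal point. The following is a minimal sketch using only the JDK, not Spark's Decimal class. The class name UnscaledDecimalDemo and the method fromUnscaled are hypothetical, and it assumes Spark's unscaled/scale convention matches BigDecimal's.

```java
import java.math.BigDecimal;

public class UnscaledDecimalDemo {
    // Build a decimal value from an unscaled long and a scale,
    // i.e. value = unscaled * 10^(-scale). Hypothetical helper, JDK only.
    public static BigDecimal fromUnscaled(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        BigDecimal d = fromUnscaled(12345L, 2);
        System.out.println(d);             // 123.45
        System.out.println(d.precision()); // 5 (total significant digits)
        System.out.println(d.scale());     // 2 (digits after the decimal point)
    }
}
```

Here a precision of 5 and a scale of 2 describe the same value 123.45 that the unscaled long 12345 encodes.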
setActive(SQLContext) - Static method in class org.apache.spark.sql.SQLContext: Changes the SQLContext that will be returned in this thread and its children when SQLContext.getOrCreate() is called.
setAggregator(Aggregator<K, V, C>) - Method in class org.apache.spark.rdd.ShuffledRDD: Set aggregator for RDD's shuffle.
setAlgo(String) - Method in class org.apache.spark.mllib.tree.configuration.Strategy: Sets Algorithm using a String.
setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf: Set multiple parameters together
setAlpha(double) - Method in class org.apache.spark.ml.recommendation.ALS
setAlpha(Vector) - Method in class org.apache.spark.mllib.clustering.LDA: Alias for setDocConcentration()
setAlpha(double) - Method in class org.apache.spark.mllib.clustering.LDA: Alias for setDocConcentration()
setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS
setAppName(String) - Method in class org.apache.spark.launcher.SparkLauncher: Set the application name.
setAppName(String) - Method in class org.apache.spark.SparkConf: Set a name for your application.
setAppResource(String) - Method in class org.apache.spark.launcher.SparkLauncher: Set the main application resource.
setBandwidth(double) - Method in class org.apache.spark.mllib.stat.KernelDensity: Sets the bandwidth (standard deviation) of the Gaussian kernel (default: 1.0).
setBeta(double) - Method in class org.apache.spark.mllib.clustering.LDA: Alias for setTopicConcentration()
setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
setBlockSize(int) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.GBTClassifier
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.GBTRegressor
setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Pass-through to SparkContext.setCallSite.
setCallSite(String) - Method in class org.apache.spark.SparkContext: Set the thread-local property for overriding the call sites of actions and RDDs.
setCaseSensitive(boolean) - Method in class org.apache.spark.ml.feature.StopWordsRemover
setCategoricalFeaturesInfo(Map<Integer, Integer>) - Method in class org.apache.spark.mllib.tree.configuration.Strategy: Sets categoricalFeaturesInfo using a Java Map.
setCensorCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Set the directory under which RDDs are going to be checkpointed.
setCheckpointDir(String) - Method in class org.apache.spark.SparkContext: Set the directory under which RDDs are going to be checkpointed.
setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setCheckpointInterval(int) - Method in class org.apache.spark.ml.clustering.LDA
setCheckpointInterval(int) - Method in class org.apache.spark.ml.recommendation.ALS
setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setCheckpointInterval(int) - Method in class org.apache.spark.mllib.clustering.LDA: Period (in iterations) between checkpoints (default = 10).
setCheckpointInterval(int) - Method in class org.apache.spark.mllib.recommendation.ALS
setCheckpointInterval(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setClassifier(Classifier<?, ?, ?>) - Method in class org.apache.spark.ml.classification.OneVsRest
setConf(String, String) - Method in class org.apache.spark.launcher.SparkLauncher: Set a single configuration value for the application.
setConf(String, String) - Method in class org.apache.spark.sql.hive.HiveContext
setConf(Properties) - Method in class org.apache.spark.sql.SQLContext: Set Spark SQL configuration properties.
setConf(String, String) - Method in class org.apache.spark.sql.SQLContext: Set the given Spark SQL configuration property.
setConfig(String, String) - Static method in class org.apache.spark.launcher.SparkLauncher: Set a configuration value for the launcher library.
setConvergenceTol(double) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Set the largest change in log-likelihood at which convergence is considered to have occurred.
setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the convergence tolerance.
setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the convergence tolerance of iterations for L-BFGS.
setConvergenceTol(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the convergence tolerance.
setDecayFactor(double) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Set the decay factor directly (for forgetful algorithms).
setDefault(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params: Sets a default value for a param.
setDefault(Seq<ParamPair<?>>) - Method in interface org.apache.spark.ml.param.Params: Sets default values for a list of params.
setDefaultClassLoader(ClassLoader) - Method in class org.apache.spark.serializer.Serializer: Sets a class loader for the serializer to use in deserialization.
setDegree(int) - Method in class org.apache.spark.ml.feature.PolynomialExpansion
setDeployMode(String) - Method in class org.apache.spark.launcher.SparkLauncher: Set the deploy mode for the application.
setDocConcentration(double[]) - Method in class org.apache.spark.ml.clustering.LDA
setDocConcentration(double) - Method in class org.apache.spark.ml.clustering.LDA
setDocConcentration(Vector) - Method in class org.apache.spark.mllib.clustering.LDA: Concentration parameter (commonly named "alpha") for the prior placed on documents' distributions over topics ("theta").
setDocConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA: Replicates a Double docConcentration to create a symmetric prior.
setDropLast(boolean) - Method in class org.apache.spark.ml.feature.OneHotEncoder
setElasticNetParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the ElasticNet mixing parameter.
setElasticNetParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression: Set the ElasticNet mixing parameter.
setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the distance threshold within which we consider centers to have converged.
setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.CrossValidator
setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.CrossValidator
setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.CrossValidator
setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf: Set an environment variable to be used when launching executors for this application.
setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf: Set multiple environment variables to be used when launching executors.
setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf: Set multiple environment variables to be used when launching executors.
setFeatureIndex(int) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setFeatureIndex(int) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.KMeans
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.LDA: The features for LDA should be a Vector representing the word counts in a document.
setFeaturesCol(String) - Method in class org.apache.spark.ml.clustering.LDAModel: The features for LDA should be a Vector representing the word counts in a document.
setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.feature.RFormula
setFeaturesCol(String) - Method in class org.apache.spark.ml.PredictionModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.Predictor
setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setFeaturesCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setFinalRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
setFitIntercept(boolean) - Method in class org.apache.spark.ml.classification.LogisticRegression: Whether to fit an intercept term.
setFitIntercept(boolean) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression: Set whether to fit the intercept. Default is true.
setFitIntercept(boolean) - Method in class org.apache.spark.ml.regression.LinearRegression: Set whether to fit the intercept. Default is true.
setFormula(String) - Method in class org.apache.spark.ml.feature.RFormula: Sets the formula to use for this transformer.
setGaps(boolean) - Method in class org.apache.spark.ml.feature.RegexTokenizer
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the gradient function (of the loss function of one single data example) to be used for SGD.
setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the gradient function (of the loss function of one single data example) to be used for L-BFGS.
setHalfLife(double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Set the half life and time unit ("batches" or "points") for forgetful algorithms.
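A half life h can be related to a per-step decay factor a by requiring a^h = 0.5, which gives a = exp(ln(0.5) / h). The sketch below works through that arithmetic in plain Java. It is an illustration under that assumption, not the StreamingKMeans implementation, and the class and method names are hypothetical.

```java
public class HalfLifeDecay {
    // Per-step decay factor a such that a^halfLife == 0.5.
    public static double decayFactor(double halfLife) {
        return Math.exp(Math.log(0.5) / halfLife);
    }

    // Weight remaining on old data after the given number of steps
    // (a step being one batch or one point, depending on the time unit).
    public static double remainingWeight(double halfLife, int steps) {
        return Math.pow(decayFactor(halfLife), steps);
    }

    public static void main(String[] args) {
        System.out.println(decayFactor(5.0));
        System.out.println(remainingWeight(5.0, 5));  // approximately 0.5
        System.out.println(remainingWeight(5.0, 10)); // approximately 0.25
    }
}
```

With a half life of 5, old data keeps about half its weight after 5 steps and about a quarter after 10.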
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.StringIndexer
setHandleInvalid(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
setIfMissing(String, String) - Method in class org.apache.spark.SparkConf: Set a parameter if it isn't already configured
setImplicitPrefs(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
setImpurity(String) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setImpurity(String) - Method in class org.apache.spark.ml.classification.GBTClassifier: The impurity setting is ignored for GBT models.
setImpurity(String) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setImpurity(String) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setImpurity(String) - Method in class org.apache.spark.ml.regression.GBTRegressor: The impurity setting is ignored for GBT models.
setImpurity(String) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setImpurity(Impurity) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setIndices(int[]) - Method in class org.apache.spark.ml.feature.VectorSlicer
setInitialCenters(Vector[], double[]) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Specify initial centers directly.
setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the initialization algorithm.
setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering: Set the initialization mode.
setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the number of steps for the k-means|| initialization mode.
setInitialModel(GaussianMixtureModel) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Set the initial GMM starting point, bypassing the random initialization.
setInitialModel(KMeansModel) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the initial starting point, bypassing the random initialization or k-means||. The condition model.k == this.k must be met; failure results in an IllegalArgumentException.
setInitialWeights(Vector) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Set the initial weights.
setInitialWeights(Vector) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the initial weights.
setInitMode(String) - Method in class org.apache.spark.ml.clustering.KMeans
setInitSteps(int) - Method in class org.apache.spark.ml.clustering.KMeans
setInputCol(String) - Method in class org.apache.spark.ml.feature.Binarizer
setInputCol(String) - Method in class org.apache.spark.ml.feature.Bucketizer
setInputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizer
setInputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.HashingTF
setInputCol(String) - Method in class org.apache.spark.ml.feature.IDF
setInputCol(String) - Method in class org.apache.spark.ml.feature.IDFModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.IndexToString
setInputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScaler
setInputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.OneHotEncoder
setInputCol(String) - Method in class org.apache.spark.ml.feature.PCA
setInputCol(String) - Method in class org.apache.spark.ml.feature.PCAModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.StopWordsRemover
setInputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexer
setInputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorSlicer
setInputCol(String) - Method in class org.apache.spark.ml.feature.Word2Vec
setInputCol(String) - Method in class org.apache.spark.ml.feature.Word2VecModel
setInputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.Interaction
setInputCols(String[]) - Method in class org.apache.spark.ml.feature.VectorAssembler
setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Set if the algorithm should add an intercept.
setIntermediateRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
setInverse(boolean) - Method in class org.apache.spark.ml.feature.DCT
setIsotonic(boolean) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setIsotonic(boolean) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS
setJars(Seq<String>) - Method in class org.apache.spark.SparkConf: Set JAR files to distribute to the cluster.
setJars(String[]) - Method in class org.apache.spark.SparkConf: Set JAR files to distribute to the cluster.
setJavaHome(String) - Method in class org.apache.spark.launcher.SparkLauncher: Set a custom JAVA_HOME for launching the Spark application.
setJobDescription(String) - Method in class org.apache.spark.SparkContext: Set a human readable description of the current job.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext: Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setK(int) - Method in class org.apache.spark.ml.clustering.KMeans
setK(int) - Method in class org.apache.spark.ml.clustering.LDA
setK(int) - Method in class org.apache.spark.ml.feature.PCA
setK(int) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Sets the desired number of leaf clusters (default: 4).
setK(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Set the number of Gaussians in the mixture model.
setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the number of clusters to create (k).
setK(int) - Method in class org.apache.spark.mllib.clustering.LDA: Number of topics to infer.
setK(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
setK(int) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Set the number of clusters.
setKappa(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Learning rate: exponential decay rate. Should be in (0.5, 1.0] to guarantee asymptotic convergence.
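In online variational Bayes for LDA, the step size at iteration t is commonly written as rho_t = (tau0 + t)^(-kappa), where kappa is the decay rate set here. The sketch below assumes that form. The offset tau0 and all class and method names are assumptions made for illustration, and the code does not call Spark.

```java
public class KappaStepSize {
    // Step size rho_t = (tau0 + t)^(-kappa); assumed form, see lead-in.
    public static double stepSize(double tau0, long t, double kappa) {
        return Math.pow(tau0 + t, -kappa);
    }

    public static void main(String[] args) {
        double tau0 = 1024.0; // hypothetical offset that down-weights early iterations
        for (long t = 0; t <= 2000; t += 1000) {
            // A larger kappa makes the step size shrink faster as t grows.
            System.out.println(t + " " + stepSize(tau0, t, 0.51)
                    + " " + stepSize(tau0, t, 1.0));
        }
    }
}
```

The step size shrinks as t grows, and more quickly for larger kappa, which is why the decay rate controls how fast earlier mini-batches are forgotten.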
setKeyOrdering(Ordering<K>) - Method in class org.apache.spark.rdd.ShuffledRDD: Set key ordering for RDD's shuffle.
setLabelCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
setLabelCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setLabelCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
setLabelCol(String) - Method in class org.apache.spark.ml.feature.RFormula
setLabelCol(String) - Method in class org.apache.spark.ml.Predictor
setLabelCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setLabelCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setLabels(String[]) - Method in class org.apache.spark.ml.feature.IndexToString
setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes
setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS
setLayers(int[]) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
setLearningDecay(double) - Method in class org.apache.spark.ml.clustering.LDA
setLearningOffset(double) - Method in class org.apache.spark.ml.clustering.LDA
setLearningRate(double) - Method in class org.apache.spark.mllib.feature.Word2Vec
setLearningRate(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext: Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext: Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setLogLevel(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Control our logLevel.
setLogLevel(String) - Method in class org.apache.spark.SparkContext: Control our logLevel.
setLoss(Loss) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
setLossType(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
setLossType(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMainClass(String) - Method in class org.apache.spark.launcher.SparkLauncher: Sets the application class name for Java/Scala applications.
setMapSideCombine(boolean) - Method in class org.apache.spark.rdd.ShuffledRDD: Set mapSideCombine flag for RDD's shuffle.
setMaster(String) - Method in class org.apache.spark.launcher.SparkLauncher: Set the Spark master for the application.
setMaster(String) - Method in class org.apache.spark.SparkConf: The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
setMax(double) - Method in class org.apache.spark.ml.feature.MinMaxScaler
setMax(double) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
setMaxBins(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setMaxBins(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMaxBins(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setMaxBins(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setMaxBins(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMaxBins(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setMaxBins(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setMaxCategories(int) - Method in class org.apache.spark.ml.feature.VectorIndexer
setMaxDepth(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setMaxDepth(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMaxDepth(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setMaxDepth(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setMaxDepth(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMaxDepth(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setMaxDepth(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setMaxIter(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMaxIter(int) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the maximum number of iterations.
setMaxIter(int) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Set the maximum number of iterations.
setMaxIter(int) - Method in class org.apache.spark.ml.clustering.KMeans
setMaxIter(int) - Method in class org.apache.spark.ml.clustering.LDA
setMaxIter(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setMaxIter(int) - Method in class org.apache.spark.ml.recommendation.ALS
setMaxIter(int) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression: Set the maximum number of iterations.
setMaxIter(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMaxIter(int) - Method in class org.apache.spark.ml.regression.LinearRegression: Set the maximum number of iterations.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Sets the max number of k-means iterations to split clusters (default: 20).
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Set the maximum number of iterations to run.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans: Set maximum number of iterations to run.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.LDA: Maximum number of iterations for learning.
setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
setMaxLocalProjDBSize(long) - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Sets the maximum number of items (including delimiters used in the internal storage format) allowed in a projected database before local processing (default: 32000000L).
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setMaxMemoryInMB(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setMaxNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Deprecated. Use LBFGS.setNumIterations(int) instead.
setMaxPatternLength(int) - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Sets maximal pattern length (default: 10).
setMetricName(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
setMetricName(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
setMetricName(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
setMin(double) - Method in class org.apache.spark.ml.feature.MinMaxScaler
setMin(double) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
setMinConfidence(double) - Method in class org.apache.spark.mllib.fpm.AssociationRules: Sets the minimal confidence (default: 0.8).
setMinCount(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setMinCount(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
setMinDF(double) - Method in class org.apache.spark.ml.feature.CountVectorizer
setMinDivisibleClusterSize(double) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Sets the minimum number of points (if >= 1.0) or the minimum proportion of points (if < 1.0) of a divisible cluster (default: 1).
setMinDocFreq(int) - Method in class org.apache.spark.ml.feature.IDF
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Set the fraction of each batch to use for updates.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Mini-batch fraction in (0, 1], which sets the fraction of documents sampled and used in each iteration.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: :: Experimental :: Set fraction of data to be used for each SGD iteration.
setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the fraction of each batch to use for updates.
setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setMinInfoGain(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setMinInstancesPerNode(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.FPGrowth: Sets the minimal support level (default: 0.3).
setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.PrefixSpan: Sets the minimal support level (default: 0.1).
setMinTF(double) - Method in class org.apache.spark.ml.feature.CountVectorizer
setMinTF(double) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
setMinTokenLength(int) - Method in class org.apache.spark.ml.feature.RegexTokenizer
setModelType(String) - Method in class org.apache.spark.ml.classification.NaiveBayes: Set the model type using a string (case-sensitive).
setModelType(String) - Method in class org.apache.spark.mllib.classification.NaiveBayes
setN(int) - Method in class org.apache.spark.ml.feature.NGram
setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.api.java.JavaRDD: Assign a name to this RDD
setName(String) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
setName(String) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
setName(String) - Method in class org.apache.spark.rdd.RDD: Assign a name to this RDD
setNames(String[]) - Method in class org.apache.spark.ml.feature.VectorSlicer
setNonnegative(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
setNonnegative(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
setNumBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS: Sets both numUserBlocks and numItemBlocks to the specific value.
setNumBuckets(int) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setNumClasses(int) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS: Set the number of possible outcomes for a k-class classification problem in Multinomial Logistic Regression.
setNumClasses(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the number of corrections used in the LBFGS update.
setNumFeatures(int) - Method in class org.apache.spark.ml.feature.HashingTF
setNumFolds(int) - Method in class org.apache.spark.ml.tuning.CrossValidator
setNumItemBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
setNumIterations(int) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Set the number of iterations of gradient descent to run per update.
setNumIterations(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the number of iterations for SGD.
setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the maximal number of iterations for L-BFGS.
setNumIterations(int) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the number of iterations of gradient descent to run per update.
setNumIterations(int) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
setNumPartitions(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setNumPartitions(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
setNumPartitions(int) - Method in class org.apache.spark.mllib.fpm.FPGrowth: Sets the number of partitions used by parallel FP-growth (default: same as input data).
setNumTopFeatures(int) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setNumTrees(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setNumTrees(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setNumUserBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
setOptimizeDocConcentration(boolean) - Method in class org.apache.spark.ml.clustering.LDA
setOptimizeDocConcentration(boolean) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: Sets whether to optimize docConcentration parameter during training.
setOptimizer(String) - Method in class org.apache.spark.ml.clustering.LDA
setOptimizer(LDAOptimizer) - Method in class org.apache.spark.mllib.clustering.LDA: :: DeveloperApi ::
setOptimizer(String) - Method in class org.apache.spark.mllib.clustering.LDA: Set the LDAOptimizer used to perform the actual calculation by algorithm name.
setOrNull(long, int, int) - Method in class org.apache.spark.sql.types.Decimal: Set this Decimal to the given unscaled Long, with a given precision and scale, and return it, or return null if it cannot be set due to overflow.
setOutputCol(String) - Method in class org.apache.spark.ml.feature.Binarizer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.Bucketizer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelector
setOutputCol(String) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.HashingTF
setOutputCol(String) - Method in class org.apache.spark.ml.feature.IDF
setOutputCol(String) - Method in class org.apache.spark.ml.feature.IDFModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.IndexToString
setOutputCol(String) - Method in class org.apache.spark.ml.feature.Interaction
setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScaler
setOutputCol(String) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.OneHotEncoder
setOutputCol(String) - Method in class org.apache.spark.ml.feature.PCA
setOutputCol(String) - Method in class org.apache.spark.ml.feature.PCAModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StopWordsRemover
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorAssembler
setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorSlicer
setOutputCol(String) - Method in class org.apache.spark.ml.feature.Word2Vec
setOutputCol(String) - Method in class org.apache.spark.ml.feature.Word2VecModel
setOutputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
setP(double) - Method in class org.apache.spark.ml.feature.Normalizer
setParent(Estimator<M>) - Method in class org.apache.spark.ml.Model: Sets the parent of this model (Java API).
setPattern(String) - Method in class org.apache.spark.ml.feature.RegexTokenizer
setPeacePeriod(int) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
setPredictionCol(String) - Method in class org.apache.spark.ml.classification.OneVsRest
setPredictionCol(String) - Method in class org.apache.spark.ml.clustering.KMeans
setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
setPredictionCol(String) - Method in class org.apache.spark.ml.PredictionModel
setPredictionCol(String) - Method in class org.apache.spark.ml.Predictor
setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
setPredictionCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setPredictionCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
setPredictionCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setPredictionCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
setProbabilityCol(String) - Method in class org.apache.spark.ml.classification.ProbabilisticClassifier
setProductBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
setPropertiesFile(String) - Method in class org.apache.spark.launcher.SparkLauncher: Set a custom properties file with Spark configuration for the application.
setQuantileCalculationStrategy(Enumeration.Value) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setQuantileProbabilities(double[]) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setQuantileProbabilities(double[]) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
setQuantilesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
setQuantilesCol(String) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
setRandomCenters(int, double, long) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Initialize random centers, requiring only the number of dimensions.
setRank(int) - Method in class org.apache.spark.ml.recommendation.ALS
setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS
setRatingCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.ClassificationModel
setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.Classifier
setRawPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
setRegParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.ml.recommendation.ALS
setRegParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the regularization parameter.
setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the regularization parameter.
setRest(long, int, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans: :: Experimental :: Set the number of runs of the algorithm to execute in parallel.
setSample(RDD<Object>) - Method in class org.apache.spark.mllib.stat.KernelDensity: Sets the sample to use for density estimation.
setSample(JavaRDD<Double>) - Method in class org.apache.spark.mllib.stat.KernelDensity: Sets the sample to use for density estimation (for Java users).
setScalingVec(Vector) - Method in class org.apache.spark.ml.feature.ElementwiseProduct
setScoreCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator: Deprecated. Use setRawPredictionCol() instead.
setSeed(long) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
setSeed(long) - Method in class org.apache.spark.ml.classification.GBTClassifier
setSeed(long) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Set the seed for weights initialization.
setSeed(long) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setSeed(long) - Method in class org.apache.spark.ml.clustering.KMeans
setSeed(long) - Method in class org.apache.spark.ml.clustering.LDA
setSeed(long) - Method in class org.apache.spark.ml.clustering.LDAModel
setSeed(long) - Method in class org.apache.spark.ml.feature.Word2Vec
setSeed(long) - Method in class org.apache.spark.ml.recommendation.ALS
setSeed(long) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
setSeed(long) - Method in class org.apache.spark.ml.regression.GBTRegressor
setSeed(long) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setSeed(long) - Method in class org.apache.spark.mllib.clustering.BisectingKMeans: Sets the random seed (default: hash value of the class name).
setSeed(long) - Method in class org.apache.spark.mllib.clustering.GaussianMixture: Set the random seed
setSeed(long) - Method in class org.apache.spark.mllib.clustering.KMeans: Set the random seed for cluster initialization.
setSeed(long) - Method in class org.apache.spark.mllib.clustering.LDA: Random seed
setSeed(long) - Method in class org.apache.spark.mllib.feature.Word2Vec
setSeed(long) - Method in class org.apache.spark.mllib.random.ExponentialGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.GammaGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.LogNormalGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.PoissonGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.UniformGenerator
setSeed(long) - Method in class org.apache.spark.mllib.random.WeibullGenerator
setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliCellSampler
setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom: Set random seed.
setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD: Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD: Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
setSmoothing(double) - Method in class org.apache.spark.ml.classification.NaiveBayes: Set the smoothing parameter.
setSolver(String) - Method in class org.apache.spark.ml.regression.LinearRegression: Set the solver algorithm used for optimization.
setSparkHome(String) - Method in class org.apache.spark.launcher.SparkLauncher: Set a custom Spark installation location for the application.
setSparkHome(String) - Method in class org.apache.spark.SparkConf: Set the location where Spark is installed on worker nodes.
setSplits(double[]) - Method in class org.apache.spark.ml.feature.Bucketizer
setSrcOnly(long, int, VD) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
setStages(PipelineStage[]) - Method in class org.apache.spark.ml.Pipeline
setStandardization(boolean) - Method in class org.apache.spark.ml.classification.LogisticRegression: Whether to standardize the training features before fitting the model.
setStandardization(boolean) - Method in class org.apache.spark.ml.regression.LinearRegression: Whether to standardize the training features before fitting the model.
setStatement(String) - Method in class org.apache.spark.ml.feature.SQLTransformer
setStepSize(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
setStepSize(double) - Method in class org.apache.spark.ml.feature.Word2Vec
setStepSize(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
setStepSize(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Set the step size for gradient descent.
setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the initial step size of SGD for the first step.
setStepSize(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Set the step size for gradient descent.
setStopWords(String[]) - Method in class org.apache.spark.ml.feature.StopWordsRemover
setSubsamplingRate(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
setSubsamplingRate(double) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
setSubsamplingRate(double) - Method in class org.apache.spark.ml.clustering.LDA
setSubsamplingRate(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
setSubsamplingRate(double) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
setSubsamplingRate(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setTaskContext(TaskContext) - Static method in class org.apache.spark.TaskContext: Set the thread local TaskContext.
setTau0(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer: A (positive) learning parameter that downweights early iterations.
setTestMethod(String) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
setThreshold(double) - Method in class org.apache.spark.ml.feature.Binarizer
setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel: Sets the threshold that separates positive predictions from negative predictions in Binary Logistic Regression.
setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel: Sets the threshold that separates positive predictions from negative predictions.
setThresholds(double[]) - Method in class org.apache.spark.ml.classification.LogisticRegression
setThresholds(double[]) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
setThresholds(double[]) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
setThresholds(double[]) - Method in class org.apache.spark.ml.classification.ProbabilisticClassifier
setTol(double) - Method in class org.apache.spark.ml.classification.LogisticRegression: Set the convergence tolerance of iterations.
setTol(double) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Set the convergence tolerance of iterations.
setTol(double) - Method in class org.apache.spark.ml.clustering.KMeans
setTol(double) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression: Set the convergence tolerance of iterations.
setTol(double) - Method in class org.apache.spark.ml.regression.LinearRegression: Set the convergence tolerance of iterations.
setToLowercase(boolean) - Method in class org.apache.spark.ml.feature.RegexTokenizer
setTopicConcentration(double) - Method in class org.apache.spark.ml.clustering.LDA
setTopicConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA: Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics' distributions over terms.
setTopicDistributionCol(String) - Method in class org.apache.spark.ml.clustering.LDA
setTrainRatio(double) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
setTreeStrategy(Strategy) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent: Set the updater function to actually perform a gradient step in a given direction.
setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS: Set the updater function to actually perform a gradient step in a given direction.
setupGroups(int) - Method in class org.apache.spark.rdd.PartitionCoalescer: Initializes targetLen partition groups and assigns a preferredLocation. This uses the coupon collector problem to estimate how many preferredLocations it must rotate through until it has seen most of the preferred locations (2 * n log(n)).
setUseNodeIdCache(boolean) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
setUserBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm: Set if the algorithm should validate data before training.
setValidationTol(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
setValue(R) - Method in class org.apache.spark.Accumulable: Set the accumulator's value; only allowed on master.
setVectorSize(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setVectorSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
setVerbose(boolean) - Method in class org.apache.spark.launcher.SparkLauncher: Enables verbose reporting for SparkSubmit.
setVocabSize(int) - Method in class org.apache.spark.ml.feature.CountVectorizer
setWeightCol(String) - Method in class org.apache.spark.ml.classification.LogisticRegression: Whether to over-/under-sample training instances according to the given weights in weightCol.
setWeightCol(String) - Method in class org.apache.spark.ml.regression.IsotonicRegression
setWeightCol(String) - Method in class org.apache.spark.ml.regression.LinearRegression: Whether to over-/under-sample training instances according to the given weights in weightCol.
setWindowSize(int) - Method in class org.apache.spark.ml.feature.Word2Vec
setWindowSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
setWindowSize(int) - Method in class org.apache.spark.mllib.stat.test.StreamingTest
setWithMean(boolean) - Method in class org.apache.spark.ml.feature.StandardScaler
setWithMean(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
setWithStd(boolean) - Method in class org.apache.spark.ml.feature.StandardScaler
setWithStd(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
sha1(Column) - Static method in class org.apache.spark.sql.functions: Calculates the SHA-1 digest of a binary column and returns the value as a 40 character hex string.
sha2(Column, int) - Static method in class org.apache.spark.sql.functions: Calculates the SHA-2 family of hash functions of a binary column and returns the value as a hex string.
shape() - Method in class org.apache.spark.mllib.random.GammaGenerator
shiftLeft(Column, int) - Static method in class org.apache.spark.sql.functions: Shift the given value numBits left.
shiftRight(Column, int) - Static method in class org.apache.spark.sql.functions: Shift the given value numBits right.
shiftRightUnsigned(Column, int) - Static method in class org.apache.spark.sql.functions: Unsigned shift the given value numBits right.
SHORT() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable short type.
ShortDecimal() - Static method in class org.apache.spark.sql.types.DecimalType
ShortestPaths - Class in org.apache.spark.graphx.lib: Computes shortest paths to the given set of landmark vertices, returning a graph where each vertex attribute is a map containing the shortest-path distance to each reachable landmark.
ShortestPaths() - Constructor for class org.apache.spark.graphx.lib.ShortestPaths
shortName() - Method in class org.apache.spark.ml.source.libsvm.DefaultSource
shortName() - Method in interface org.apache.spark.sql.sources.DataSourceRegister: The string that represents the format that this data source provider uses.
ShortType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the ShortType object.
ShortType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing Short values.
shouldDistributeGaussians(int, int) - Static method in class org.apache.spark.mllib.clustering.GaussianMixture: Heuristic to distribute the computation of the MultivariateGaussians, approximately when d > 25 except for when k is very small.
shouldGoLeft(Vector) - Method in interface org.apache.spark.ml.tree.Split: Return true (split to left) or false (split to right).
shouldGoLeft(int, Split[]) - Method in interface org.apache.spark.ml.tree.Split: Return true (split to left) or false (split to right).
shouldOverwrite() - Method in class org.apache.spark.ml.util.MLWriter
shouldOwn(Param<?>) - Method in interface org.apache.spark.ml.param.Params: Validates that the input param belongs to this instance.
show(int) - Method in class org.apache.spark.sql.DataFrame: Displays the DataFrame in a tabular form.
show() - Method in class org.apache.spark.sql.DataFrame: Displays the top 20 rows of DataFrame in a tabular form.
show(boolean) - Method in class org.apache.spark.sql.DataFrame: Displays the top 20 rows of DataFrame in a tabular form.
show(int, boolean) - Method in class org.apache.spark.sql.DataFrame: Displays the DataFrame in a tabular form.
show(int) - Method in class org.apache.spark.sql.Dataset: Displays the content of this Dataset in a tabular form.
show() - Method in class org.apache.spark.sql.Dataset: Displays the top 20 rows of Dataset in a tabular form.
show(boolean) - Method in class org.apache.spark.sql.Dataset: Displays the top 20 rows of Dataset in a tabular form.
show(int, boolean) - Method in class org.apache.spark.sql.Dataset: Displays the Dataset in a tabular form.
showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showBytesDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showBytesDistribution(String, org.apache.spark.util.Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, org.apache.spark.util.Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, Option<org.apache.spark.util.Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, Option<org.apache.spark.util.Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
SHUFFLE_DATA() - Static method in class org.apache.spark.storage.BlockId
SHUFFLE_INDEX() - Static method in class org.apache.spark.storage.BlockId
ShuffleBlockId - Class in org.apache.spark.storage
ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
ShuffleDataBlockId - Class in org.apache.spark.storage
ShuffleDataBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleDataBlockId
ShuffleDependency<K,V,C> - Class in org.apache.spark: :: DeveloperApi :: Represents a dependency on the output of a shuffle stage.
ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Option<Serializer>, Option<Ordering<K>>, Option<Aggregator<K, V, C>>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.ShuffleDependency
ShuffledRDD<K,V,C> - Class in org.apache.spark.rdd: :: DeveloperApi :: The resulting RDD from a shuffle.
ShuffledRDD(RDD<? extends Product2<K, V>>, Partitioner, ClassTag<K>, ClassTag<V>, ClassTag<C>) - Constructor for class org.apache.spark.rdd.ShuffledRDD
shuffleHandle() - Method in class org.apache.spark.ShuffleDependency
shuffleId() - Method in class org.apache.spark.CleanShuffle
shuffleId() - Method in class org.apache.spark.FetchFailed
shuffleId() - Method in class org.apache.spark.ShuffleDependency
shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
shuffleId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
shuffleId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
ShuffleIndexBlockId - Class in org.apache.spark.storage
ShuffleIndexBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleIndexBlockId
shuffleManager() - Method in class org.apache.spark.SparkEnv
shuffleRead() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
shuffleReadBytes() - Method in class org.apache.spark.status.api.v1.StageData
ShuffleReadMetricDistributions - Class in org.apache.spark.status.api.v1
ShuffleReadMetrics - Class in org.apache.spark.status.api.v1
shuffleReadMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
shuffleReadMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
shuffleReadRecords() - Method in class org.apache.spark.status.api.v1.StageData
shuffleWrite() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
shuffleWriteBytes() - Method in class org.apache.spark.status.api.v1.StageData
ShuffleWriteMetricDistributions - Class in org.apache.spark.status.api.v1
ShuffleWriteMetrics - Class in org.apache.spark.status.api.v1
shuffleWriteMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
shuffleWriteMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
shuffleWriteRecords() - Method in class org.apache.spark.status.api.v1.StageData
sigma() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
sigmas() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
SignalLoggerHandler - Class in org.apache.spark.util
SignalLoggerHandler(String, Logger) - Constructor for class org.apache.spark.util.SignalLoggerHandler
signum(Column) - Static method in class org.apache.spark.sql.functions: Computes the signum of the given value.
signum(String) - Static method in class org.apache.spark.sql.functions: Computes the signum of the given column.
SimpleFutureAction<T> - Class in org.apache.spark: A FutureAction holding the result of an action that triggers a single job.
simpleString() - Method in class org.apache.spark.sql.hive.HiveContext.QueryExecution
simpleString() - Method in class org.apache.spark.sql.types.ArrayType
simpleString() - Method in class org.apache.spark.sql.types.ByteType
simpleString() - Method in class org.apache.spark.sql.types.DataType: Readable string representation for the type.
simpleString() - Method in class org.apache.spark.sql.types.DecimalType
simpleString() - Method in class org.apache.spark.sql.types.IntegerType
simpleString() - Method in class org.apache.spark.sql.types.LongType
simpleString() - Method in class org.apache.spark.sql.types.MapType
simpleString() - Method in class org.apache.spark.sql.types.ShortType
simpleString() - Method in class org.apache.spark.sql.types.StructType
SimpleUpdater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: A simple updater for gradient descent *without* any regularization.
SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
SIMR_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
sin(Column) - Static method in class org.apache.spark.sql.functions: Computes the sine of the given value.
sin(String) - Static method in class org.apache.spark.sql.functions: Computes the sine of the given column.
SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg: Represents singular value decomposition (SVD) factors.
SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
sinh(Column) - Static method in class org.apache.spark.sql.functions: Computes the hyperbolic sine of the given value.
sinh(String) - Static method in class org.apache.spark.sql.functions: Computes the hyperbolic sine of the given column.
size() - Method in class org.apache.spark.ml.attribute.AttributeGroup: Size of the attribute group.
size() - Method in class org.apache.spark.ml.param.ParamMap: Number of param pairs in this map.
size() - Method in class org.apache.spark.mllib.linalg.DenseVector
size() - Method in class org.apache.spark.mllib.linalg.SparseVector
size() - Method in interface org.apache.spark.mllib.linalg.Vector: Size of the vector.
size() - Method in class org.apache.spark.rdd.PartitionGroup
size(Column) - Static method in class org.apache.spark.sql.functions: Returns length of array or map.
size() - Method in interface org.apache.spark.sql.Row: Number of elements in the Row.
size() - Method in class org.apache.spark.storage.MemoryEntry
SizeEstimator - Class in org.apache.spark.util: :: DeveloperApi :: Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in memory-aware caches.
SizeEstimator() - Constructor for class org.apache.spark.util.SizeEstimator
sizeInBytes() - Method in class org.apache.spark.sql.sources.BaseRelation: Returns an estimated size of this relation in bytes.
sizeInBytes() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
sketch(RDD<K>, int, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner: Sketches the input RDD via reservoir sampling on each partition.
skewness(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the skewness of the values in a group.
skewness(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the skewness of the values in a group.
skip(long) - Method in class org.apache.spark.storage.BufferReleasingInputStream
skippedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
slack() - Method in class org.apache.spark.rdd.PartitionCoalescer
slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return all the RDDs between 'fromDuration' and 'toDuration' (both included)
slice(org.apache.spark.streaming.Interval) - Method in class org.apache.spark.streaming.dstream.DStream: Return all the RDDs defined by the Interval object (both end times included)
slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream: Return all the RDDs between 'fromTime' and 'toTime' (both included)
slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream: Time interval after which the DStream generates an RDD
slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
sliding(int, int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions: Returns an RDD from grouping items of its parent RDD in fixed size blocks by passing a sliding window over them.
sliding(int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions: Same as sliding(int, int) with step = 1.
SnappyCompressionCodec - Class in org.apache.spark.io: :: DeveloperApi :: Snappy implementation of CompressionCodec.
SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
SnappyOutputStreamWrapper - Class in org.apache.spark.io: Wrapper over SnappyOutputStream which guards against write-after-close and double-close issues.
SnappyOutputStreamWrapper(SnappyOutputStream) - Constructor for class org.apache.spark.io.SnappyOutputStreamWrapper
socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from TCP source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream from network source hostname:port.
socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream from TCP source hostname:port.
Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
sort(String, String...) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame sorted by the specified column, all in ascending order.
sort(Column...) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame sorted by the given expressions.
sort(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame sorted by the specified column, all in ascending order.
sort(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame sorted by the given expressions.
sort_array(Column) - Static method in class org.apache.spark.sql.functions: Sorts the input array for the given column in ascending order, according to the natural ordering of the array elements.
sort_array(Column, boolean) - Static method in class org.apache.spark.sql.functions: Sorts the input array for the given column in ascending / descending order, according to the natural ordering of the array elements.
sortBy(Function<T, S>, boolean, int) - Method in class org.apache.spark.api.java.JavaRDD: Return this RDD sorted by the given key function.
sortBy(Function1<T, K>, boolean, int, Ordering<K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD: Return this RDD sorted by the given key function.
sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements in ascending order.
sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions: Sort the RDD by key, so that each partition contains a sorted range of the elements.
sortWithinPartitions(String, String...) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with each partition sorted by the given expressions.
sortWithinPartitions(Column...) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with each partition sorted by the given expressions.
sortWithinPartitions(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with each partition sorted by the given expressions.
sortWithinPartitions(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with each partition sorted by the given expressions.
soundex(Column) - Static method in class org.apache.spark.sql.functions: Return the soundex code for the specified expression.
SPARK_JOB_DESCRIPTION() - Static method in class org.apache.spark.SparkContext
SPARK_JOB_GROUP_ID() - Static method in class org.apache.spark.SparkContext
SPARK_JOB_INTERRUPT_ON_CANCEL() - Static method in class org.apache.spark.SparkContext
SPARK_MASTER - Static variable in class org.apache.spark.launcher.SparkLauncher: The Spark master.
spark_partition_id() - Static method in class org.apache.spark.sql.functions: Partition ID of the Spark task.
SPARK_REGEX() - Static method in class org.apache.spark.SparkMasterRegex
SparkAppHandle - Interface in org.apache.spark.launcher: A handle to a running Spark application.
SparkAppHandle.Listener - Interface in org.apache.spark.launcher: Listener for updates to a handle's state.
SparkAppHandle.State - Enum in org.apache.spark.launcher: Represents the application's state.
SparkConf - Class in org.apache.spark: Configuration for a Spark application.
SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
SparkConf() - Constructor for class org.apache.spark.SparkConf: Create a SparkConf that loads defaults from system properties and the classpath
sparkContext() - Method in class org.apache.spark.rdd.RDD: The SparkContext that created this RDD.
SparkContext - Class in org.apache.spark: Main entry point for Spark functionality.
SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
SparkContext() - Constructor for class org.apache.spark.SparkContext: Create a SparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).
SparkContext(SparkConf, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext: :: DeveloperApi :: Alternative constructor for setting preferred locations where Spark will create executors.
SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext: Alternative constructor that allows setting common Spark properties directly
SparkContext(String, String, String, Seq<String>, Map<String, String>, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext: Alternative constructor that allows setting common Spark properties directly
sparkContext() - Method in class org.apache.spark.sql.SQLContext
sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: The underlying SparkContext
sparkContext() - Method in class org.apache.spark.streaming.StreamingContext: Return the associated Spark context
SparkContext.DoubleAccumulatorParam$ - Class in org.apache.spark
SparkContext.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.DoubleAccumulatorParam$
SparkContext.FloatAccumulatorParam$ - Class in org.apache.spark
SparkContext.FloatAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.FloatAccumulatorParam$
SparkContext.IntAccumulatorParam$ - Class in org.apache.spark
SparkContext.IntAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.IntAccumulatorParam$
SparkContext.LongAccumulatorParam$ - Class in org.apache.spark
SparkContext.LongAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.LongAccumulatorParam$
SparkEnv - Class in org.apache.spark: :: DeveloperApi :: Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, Akka actor system, block manager, map output tracker, etc.
SparkEnv(String, org.apache.spark.rpc.RpcEnv, ActorSystem, Serializer, Serializer, CacheManager, MapOutputTracker, ShuffleManager, org.apache.spark.broadcast.BroadcastManager, BlockTransferService, org.apache.spark.storage.BlockManager, SecurityManager, String, org.apache.spark.metrics.MetricsSystem, MemoryManager, org.apache.spark.scheduler.OutputCommitCoordinator, SparkConf) - Constructor for class org.apache.spark.SparkEnv
SparkException - Exception in org.apache.spark
SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
SparkException(String) - Constructor for exception org.apache.spark.SparkException
SparkFiles - Class in org.apache.spark: Resolves paths to files added through SparkContext.addFile().
SparkFiles() - Constructor for class org.apache.spark.SparkFiles
sparkFilesDir() - Method in class org.apache.spark.SparkEnv
SparkFirehoseListener - Class in org.apache.spark: Class that allows users to receive all SparkListener events.
SparkFirehoseListener() - Constructor for class org.apache.spark.SparkFirehoseListener
SparkFlumeEvent - Class in org.apache.spark.streaming.flume: A wrapper class for AvroFlumeEvent's with a custom serialization format.
SparkFlumeEvent() - Constructor for class org.apache.spark.streaming.flume.SparkFlumeEvent
SparkJobInfo - Interface in org.apache.spark: Exposes information about Spark Jobs.
SparkJobInfoImpl - Class in org.apache.spark
SparkJobInfoImpl(int, int[], JobExecutionStatus) - Constructor for class org.apache.spark.SparkJobInfoImpl
SparkLauncher - Class in org.apache.spark.launcher: Launcher for Spark applications.
SparkLauncher() - Constructor for class org.apache.spark.launcher.SparkLauncher
SparkLauncher(Map<String, String>) - Constructor for class org.apache.spark.launcher.SparkLauncher: Creates a launcher that will set the given environment variables in the child.
SparkListener - Interface in org.apache.spark.scheduler: :: DeveloperApi :: Interface for listening to events from the Spark scheduler.
SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
SparkListenerApplicationStart - Class in org.apache.spark.scheduler
SparkListenerApplicationStart(String, Option<String>, long, String, Option<String>, Option<Map<String, String>>) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
SparkListenerBlockManagerAdded(long, BlockManagerId, long) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
SparkListenerBlockManagerRemoved(long, BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
SparkListenerBlockUpdated - Class in org.apache.spark.scheduler
SparkListenerBlockUpdated(BlockUpdatedInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockUpdated
SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
SparkListenerEvent - Interface in org.apache.spark.scheduler
SparkListenerExecutorAdded - Class in org.apache.spark.scheduler
SparkListenerExecutorAdded(long, String, ExecutorInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorAdded
SparkListenerExecutorMetricsUpdate - Class in org.apache.spark.scheduler: Periodic updates from executors.
SparkListenerExecutorMetricsUpdate(String, Seq<Tuple4<Object, Object, Object, TaskMetrics>>) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
SparkListenerExecutorRemoved - Class in org.apache.spark.scheduler
SparkListenerExecutorRemoved(long, String, String) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorRemoved
SparkListenerJobEnd - Class in org.apache.spark.scheduler
SparkListenerJobEnd(int, long, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
SparkListenerJobStart - Class in org.apache.spark.scheduler
SparkListenerJobStart(int, long, Seq<StageInfo>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
SparkListenerStageCompleted - Class in org.apache.spark.scheduler
SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
SparkListenerTaskEnd - Class in org.apache.spark.scheduler
SparkListenerTaskEnd(int, int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
SparkListenerTaskStart - Class in org.apache.spark.scheduler
SparkListenerTaskStart(int, int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
SparkMasterRegex - Class in org.apache.spark: A collection of regexes for extracting information from the master string.
SparkMasterRegex() - Constructor for class org.apache.spark.SparkMasterRegex
sparkPartitionId() - Static method in class org.apache.spark.sql.functions: Deprecated. As of 1.6.0, replaced by spark_partition_id. This will be removed in Spark 2.0.
sparkProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
SparkShutdownHook - Class in org.apache.spark.util
SparkShutdownHook(int, Function0<BoxedUnit>) - Constructor for class org.apache.spark.util.SparkShutdownHook
SparkStageInfo - Interface in org.apache.spark: Exposes information about Spark Stages.
SparkStageInfoImpl - Class in org.apache.spark
SparkStageInfoImpl(int, int, long, String, int, int, int, int) - Constructor for class org.apache.spark.SparkStageInfoImpl
SparkStatusTracker - Class in org.apache.spark: Low-level status reporting APIs for monitoring job and stage progress.
sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
sparkUser() - Method in class org.apache.spark.SparkContext
sparkUser() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
sparse(int, int, int[], int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices: Creates a column-major sparse matrix in Compressed Sparse Column (CSC) format.
sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector providing its index array and value array.
sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector using unordered (index, value) pairs.
sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
SparseMatrix - Class in org.apache.spark.mllib.linalg: Column-major sparse matrix.
SparseMatrix(int, int, int[], int[], double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
SparseMatrix(int, int, int[], int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix: Column-major sparse matrix.
SparseVector - Class in org.apache.spark.mllib.linalg: A sparse vector represented by an index array and a value array.
SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
sparsity() - Method in class org.apache.spark.ml.attribute.NumericAttribute
spdiag(Vector) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate a diagonal matrix in SparseMatrix format from the supplied values.
SpecialLengths - Class in org.apache.spark.api.r
SpecialLengths() - Constructor for class org.apache.spark.api.r.SpecialLengths
speculative() - Method in class org.apache.spark.scheduler.TaskInfo
speculative() - Method in class org.apache.spark.status.api.v1.TaskData
speye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a sparse Identity Matrix in Matrix format.
speye(int) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate an Identity Matrix in SparseMatrix format.
SpillListener - Class in org.apache.spark: A SparkListener that detects whether spills have occurred in Spark jobs.
SpillListener() - Constructor for class org.apache.spark.SpillListener
split() - Method in class org.apache.spark.ml.tree.InternalNode
Split - Interface in org.apache.spark.ml.tree: :: DeveloperApi :: Interface for a "Split," which specifies a test made at a decision tree node to choose the left or right path.
split() - Method in class org.apache.spark.mllib.tree.model.Node
Split - Class in org.apache.spark.mllib.tree.model: :: DeveloperApi :: Split applied to a feature. param: feature - feature index. param: threshold - threshold for continuous feature.
Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
split(Column, String) - Static method in class org.apache.spark.sql.functions: Splits str around pattern (pattern is a regular expression).
SPLIT_INFO_REFLECTIONS() - Static method in class org.apache.spark.rdd.HadoopRDD
splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
SplitInfo - Class in org.apache.spark.scheduler
SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
splits() - Method in interface org.apache.spark.api.java.JavaRDDLike
splits() - Method in class org.apache.spark.ml.feature.Bucketizer: Parameter for mapping continuous features into buckets.
sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
sqdist(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.Vectors: Returns the squared distance between two Vectors.
sql(String) - Method in class org.apache.spark.sql.SQLContext
sqlContext() - Method in class org.apache.spark.ml.clustering.LDAModel
sqlContext() - Method in class org.apache.spark.sql.DataFrame
sqlContext() - Method in class org.apache.spark.sql.Dataset
sqlContext() - Method in class org.apache.spark.sql.sources.BaseRelation
SQLContext - Class in org.apache.spark.sql: The entry point for working with structured data (rows and columns) in Spark.
SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
SQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.SQLContext
SQLContext.implicits$ - Class in org.apache.spark.sql: :: Experimental :: (Scala-specific) Implicit methods available in Scala for converting common Scala objects into DataFrames.
SQLContext.implicits$() - Constructor for class org.apache.spark.sql.SQLContext.implicits$
SQLContext.implicits$.StringToColumn - Class in org.apache.spark.sql: Converts $"col name" into a Column.
SQLContext.implicits$.StringToColumn(StringContext) - Constructor for class org.apache.spark.sql.SQLContext.implicits$.StringToColumn
SQLContext.QueryExecution - Class in org.apache.spark.sql
SQLContext.QueryExecution(LogicalPlan) - Constructor for class org.apache.spark.sql.SQLContext.QueryExecution
SQLContext.SparkPlanner - Class in org.apache.spark.sql
SQLContext.SparkPlanner() - Constructor for class org.apache.spark.sql.SQLContext.SparkPlanner
SQLImplicits - Class in org.apache.spark.sql: A collection of implicit methods for converting common Scala objects into DataFrames.
SQLImplicits() - Constructor for class org.apache.spark.sql.SQLImplicits
sqlParser() - Method in class org.apache.spark.sql.SQLContext
SQLTransformer - Class in org.apache.spark.ml.feature: :: Experimental :: Implements the transformations defined by a SQL statement.
SQLTransformer(String) - Constructor for class org.apache.spark.ml.feature.SQLTransformer
SQLTransformer() - Constructor for class org.apache.spark.ml.feature.SQLTransformer
sqlType() - Method in class org.apache.spark.mllib.linalg.VectorUDT
sqlType() - Method in class org.apache.spark.sql.types.UserDefinedType: Underlying storage type for this UDT
SQLUserDefinedType - Annotation Type in org.apache.spark.sql.types: ::DeveloperApi:: A user-defined type which can be automatically recognized by a SQLContext and registered.
sqrt(Column) - Static method in class org.apache.spark.sql.functions: Computes the square root of the specified float value.
sqrt(String) - Static method in class org.apache.spark.sql.functions: Computes the square root of the specified float value.
squaredDist(Vector) - Method in class org.apache.spark.util.Vector
SquaredError - Class in org.apache.spark.mllib.tree.loss: :: DeveloperApi :: Class for squared error loss calculation.
SquaredError() - Constructor for class org.apache.spark.mllib.tree.loss.SquaredError
SquaredL2Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Updater for L2 regularized problems.
SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
Src - Static variable in class org.apache.spark.graphx.TripletFields: Expose the source and edge fields but not the destination field.
srcAttr() - Method in class org.apache.spark.graphx.EdgeContext: The vertex attribute of the edge's source vertex.
srcAttr() - Method in class org.apache.spark.graphx.EdgeTriplet: The source vertex attribute
srcAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
srcId() - Method in class org.apache.spark.graphx.Edge
srcId() - Method in class org.apache.spark.graphx.EdgeContext: The vertex id of the edge's source vertex.
srcId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
ssc() - Method in class org.apache.spark.streaming.dstream.DStream
stackTrace() - Method in class org.apache.spark.ExceptionFailure
stage() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
StageData - Class in org.apache.spark.status.api.v1
stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
stageId() - Method in class org.apache.spark.scheduler.StageInfo
stageId() - Method in interface org.apache.spark.SparkStageInfo
stageId() - Method in class org.apache.spark.SparkStageInfoImpl
stageId() - Method in class org.apache.spark.status.api.v1.StageData
stageId() - Method in class org.apache.spark.TaskContext: The ID of the stage that this task belongs to.
stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
stageIds() - Method in interface org.apache.spark.SparkJobInfo
stageIds() - Method in class org.apache.spark.SparkJobInfoImpl
stageIds() - Method in class org.apache.spark.status.api.v1.JobData
stageIdToActiveJobIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageIdToInfo() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
StageInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Stores information about a stage to pass from the scheduler to SparkListeners.
StageInfo(int, int, String, int, Seq<RDDInfo>, Seq<Object>, String, Seq<Seq<TaskLocation>>) - Constructor for class org.apache.spark.scheduler.StageInfo
stageInfos() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
stageLogInfo(int, String, boolean) - Method in class org.apache.spark.scheduler.JobLogger: Write info into log file
stages() - Method in class org.apache.spark.ml.Pipeline: param for pipeline stages
stages() - Method in class org.apache.spark.ml.PipelineModel
StageStatus - Enum in org.apache.spark.status.api.v1
StandardNormalGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the standard normal distribution.
StandardNormalGenerator() - Constructor for class org.apache.spark.mllib.random.StandardNormalGenerator
StandardScaler - Class in org.apache.spark.ml.feature: :: Experimental :: Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.
StandardScaler(String) - Constructor for class org.apache.spark.ml.feature.StandardScaler
StandardScaler() - Constructor for class org.apache.spark.ml.feature.StandardScaler
StandardScaler - Class in org.apache.spark.mllib.feature: Standardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set.
StandardScaler(boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScaler
StandardScaler() - Constructor for class org.apache.spark.mllib.feature.StandardScaler
StandardScalerModel - Class in org.apache.spark.ml.feature
StandardScalerModel - Class in org.apache.spark.mllib.feature: Represents a StandardScaler model that can transform vectors.
StandardScalerModel(Vector, Vector, boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
StandardScalerModel(Vector, Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
StandardScalerModel(Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
starGraph(SparkContext, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators: Create a star graph with vertex 0 being the center.
start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Start the execution of the streams.
start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
start() - Method in class org.apache.spark.streaming.dstream.InputDStream: Method called to start receiving data.
start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
start() - Method in class org.apache.spark.streaming.StreamingContext: Start the execution of the streams.
startApplication(SparkAppHandle.Listener...) - Method in class org.apache.spark.launcher.SparkLauncher: Starts a Spark application.
startIndexInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node: Return the index of the first node in the given level.
startPosition() - Method in exception org.apache.spark.sql.AnalysisException
startsWith(Column) - Method in class org.apache.spark.sql.Column: String starts with.
startsWith(String) - Method in class org.apache.spark.sql.Column: String starts with another string literal.
startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
startTime() - Method in class org.apache.spark.SparkContext
startTime() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
startTime() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
startTime() - Method in class org.apache.spark.ui.jobs.JobProgressListener
stat() - Method in class org.apache.spark.sql.DataFrame: Returns a DataFrameStatFunctions object for working with statistic functions.
StatCounter - Class in org.apache.spark.util: A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way.
StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
StatCounter() - Constructor for class org.apache.spark.util.StatCounter: Initialize the StatCounter with no values.
state() - Method in class org.apache.spark.scheduler.local.StatusUpdate
State<S> - Class in org.apache.spark.streaming: :: Experimental :: Abstract class for getting and updating the state in mapping function used in the mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).
State() - Constructor for class org.apache.spark.streaming.State
stateChanged(SparkAppHandle) - Method in interface org.apache.spark.launcher.SparkAppHandle.Listener: Callback for changes in the handle's state.
statement() - Method in class org.apache.spark.ml.feature.SQLTransformer: SQL statement parameter.
stateSnapshots() - Method in class org.apache.spark.streaming.api.java.JavaMapWithStateDStream
stateSnapshots() - Method in class org.apache.spark.streaming.dstream.MapWithStateDStream: Return a pair DStream where each RDD is the snapshot of the state of all the keys.
StateSpec<KeyType,ValueType,StateType,MappedType> - Class in org.apache.spark.streaming: :: Experimental :: Abstract class representing all the specifications of the DStream transformation mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).
StateSpec() - Constructor for class org.apache.spark.streaming.StateSpec
staticPageRank(int, double) - Method in class org.apache.spark.graphx.GraphOps: Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
staticPersonalizedPageRank(long, int, double) - Method in class org.apache.spark.graphx.GraphOps: Run Personalized PageRank for a fixed number of iterations, with all iterations originating at the source node, returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
statistic() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
statistic() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
statistic() - Method in interface org.apache.spark.mllib.stat.test.TestResult: Test statistic.
Statistics - Class in org.apache.spark.mllib.stat
Statistics() - Constructor for class org.apache.spark.mllib.stat.Statistics
Statistics - Class in org.apache.spark.streaming.receiver: :: DeveloperApi :: Statistics for querying the supervisor about state of workers.
Statistics(int, int, int, String) - Constructor for class org.apache.spark.streaming.receiver.Statistics
stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
stats() - Method in class org.apache.spark.mllib.tree.model.Node
stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
StatsReportListener - Class in org.apache.spark.scheduler: :: DeveloperApi :: Simple SparkListener that logs a few summary statistics when each stage completes
StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
StatsReportListener - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: A simple StreamingListener that logs summary statistics across Spark Streaming batches. param: numBatchInfos - Number of last batches to consider for generating statistics (default: 10).
StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
status() - Method in class org.apache.spark.scheduler.TaskInfo
status() - Method in interface org.apache.spark.SparkJobInfo
status() - Method in class org.apache.spark.SparkJobInfoImpl
status() - Method in class org.apache.spark.status.api.v1.JobData
status() - Method in class org.apache.spark.status.api.v1.StageData
statusTracker() - Method in class org.apache.spark.api.java.JavaSparkContext
statusTracker() - Method in class org.apache.spark.SparkContext
StatusUpdate - Class in org.apache.spark.scheduler.local
StatusUpdate(long, Enumeration.Value, ByteBuffer) - Constructor for class org.apache.spark.scheduler.local.StatusUpdate
std() - Method in class org.apache.spark.ml.attribute.NumericAttribute
std() - Method in class org.apache.spark.ml.feature.StandardScalerModel
std() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
std() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
stddev(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: alias for stddev_samp.
stddev(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: alias for stddev_samp.
stddev_pop(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the population standard deviation of the expression in a group.
stddev_pop(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the population standard deviation of the expression in a group.
stddev_samp(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sample standard deviation of the expression in a group.
stddev_samp(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sample standard deviation of the expression in a group.
stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the standard deviation of this RDD's elements.
stdev() - Method in class org.apache.spark.util.StatCounter: Return the standard deviation of the values.
stop() - Method in class org.apache.spark.api.java.JavaSparkContext: Shut down the SparkContext.
stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
stop() - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
stop() - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
stop() - Method in interface org.apache.spark.launcher.SparkAppHandle: Asks the application to stop.
stop() - Method in class org.apache.spark.SparkContext
stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Stop the execution of the streams.
stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
stop() - Method in class org.apache.spark.streaming.dstream.InputDStream: Method called to stop receiving data.
stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver: Stop the receiver completely.
stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver: Stop the receiver completely due to an exception
stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext: Stop the execution of the streams immediately (does not wait for all received data to be processed).
stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext: Stop the execution of the streams, with option of ensuring all received data has been processed.
StopCoordinator - Class in org.apache.spark.scheduler
StopCoordinator() - Constructor for class org.apache.spark.scheduler.StopCoordinator
StopExecutor - Class in org.apache.spark.scheduler.local
StopExecutor() - Constructor for class org.apache.spark.scheduler.local.StopExecutor
stopped() - Method in class org.apache.spark.SparkContext
stopWords() - Method in class org.apache.spark.ml.feature.StopWordsRemover: The set of stop words to be filtered out. Default: StopWords.English
StopWordsRemover - Class in org.apache.spark.ml.feature: :: Experimental :: A feature transformer that filters out stop words from input.
StopWordsRemover(String) - Constructor for class org.apache.spark.ml.feature.StopWordsRemover
StopWordsRemover() - Constructor for class org.apache.spark.ml.feature.StopWordsRemover
storageLevel() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
storageLevel() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
storageLevel() - Method in class org.apache.spark.storage.BlockStatus
storageLevel() - Method in class org.apache.spark.storage.BlockUpdatedInfo
storageLevel() - Method in class org.apache.spark.storage.RDDInfo
StorageLevel - Class in org.apache.spark.storage: :: DeveloperApi :: Flags for controlling the storage of an RDD.
StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
storageLevel() - Method in class org.apache.spark.streaming.dstream.DStream
storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
storageLevelCache() - Static method in class org.apache.spark.storage.StorageLevel: :: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
StorageLevels - Class in org.apache.spark.api.java: Expose some commonly useful storage level constants.
StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
StorageListener - Class in org.apache.spark.ui.storage: :: DeveloperApi :: A SparkListener that prepares information to be displayed on the BlockManagerUI.
StorageListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.storage.StorageListener
StorageStatus - Class in org.apache.spark.storage: :: DeveloperApi :: Storage information for each BlockManager.
StorageStatus(BlockManagerId, long) - Constructor for class org.apache.spark.storage.StorageStatus
StorageStatus(BlockManagerId, long, Map<BlockId, BlockStatus>) - Constructor for class org.apache.spark.storage.StorageStatus: Create a storage status with an initial set of blocks, leaving the source unmodified.
storageStatusList() - Method in class org.apache.spark.storage.StorageStatusListener
storageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
storageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
StorageStatusListener - Class in org.apache.spark.storage: :: DeveloperApi :: A SparkListener that maintains executor storage status.
StorageStatusListener() - Constructor for class org.apache.spark.storage.StorageStatusListener
store(Iterator<T>) - Method in interface org.apache.spark.streaming.receiver.ActorHelper: Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in interface org.apache.spark.streaming.receiver.ActorHelper: Store the bytes of received data as a data block into Spark's memory.
store(T) - Method in interface org.apache.spark.streaming.receiver.ActorHelper: Store a single item of received data to Spark's memory.
store(T) - Method in class org.apache.spark.streaming.receiver.Receiver: Store a single item of received data to Spark's memory.
store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an ArrayBuffer of received data as a data block into Spark's memory.
store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an ArrayBuffer of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store an iterator of received data as a data block into Spark's memory.
store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver: Store the bytes of received data as a data block into Spark's memory.
store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver: Store the bytes of received data as a data block into Spark's memory.
Strategy - Class in org.apache.spark.mllib.tree.configuration: Stores all the configuration options for tree construction. param: algo - Learning goal.
Strategy(Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>, int, double, int, double, boolean, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
Strategy(Enumeration.Value, Impurity, int, int, int, Map<Integer, Integer>) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy: Java-friendly constructor for Strategy
STREAM() - Static method in class org.apache.spark.storage.BlockId
StreamBlockId - Class in org.apache.spark.storage
StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
streamId() - Method in class org.apache.spark.storage.StreamBlockId
streamId() - Method in class org.apache.spark.streaming.receiver.Receiver: Get the unique identifier of the receiver input stream that this receiver is associated with.
streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
streamIdToInputInfo() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
streamIdToNumRecords() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
StreamingContext - Class in org.apache.spark.streaming: Main entry point for Spark Streaming functionality.
StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext using an existing SparkContext.
StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext by providing the configuration necessary for a new SparkContext.
StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext: Create a StreamingContext by providing the details necessary for creating a new SparkContext.
StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext: Recreate a StreamingContext from a checkpoint file.
StreamingContext(String) - Constructor for class org.apache.spark.streaming.StreamingContext: Recreate a StreamingContext from a checkpoint file.
StreamingContext(String, SparkContext) - Constructor for class org.apache.spark.streaming.StreamingContext: Recreate a StreamingContext from a checkpoint file using an existing SparkContext.
StreamingContextState - Enum in org.apache.spark.streaming: :: DeveloperApi :: Represents the state of a StreamingContext.
StreamingKMeans - Class in org.apache.spark.mllib.clustering: StreamingKMeans provides methods for configuring a streaming k-means analysis, training the model on streaming data, and using the model to make predictions on streaming data.
StreamingKMeans(int, double, String) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
StreamingKMeans() - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
StreamingKMeansModel - Class in org.apache.spark.mllib.clustering: StreamingKMeansModel extends MLlib's KMeansModel for streaming algorithms, so it can keep track of a continuously updated weight associated with each cluster, and also update the model by doing a single iteration of the standard k-means algorithm.
StreamingKMeansModel(Vector[], double[]) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeansModel
StreamingLinearAlgorithm<M extends GeneralizedLinearModel,A extends GeneralizedLinearAlgorithm<M>> - Class in org.apache.spark.mllib.regression: :: DeveloperApi :: StreamingLinearAlgorithm implements methods for continuously training a generalized linear model on streaming data, and using it for prediction on (possibly different) streaming data.
StreamingLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
StreamingLinearRegressionWithSGD - Class in org.apache.spark.mllib.regression: Train or predict a linear regression model on streaming data.
StreamingLinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD: Construct a StreamingLinearRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}.
StreamingListener - Interface in org.apache.spark.streaming.scheduler: :: DeveloperApi :: A listener interface for receiving information about an ongoing streaming computation.
StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Base trait for events related to StreamingListener
StreamingListenerOutputOperationCompleted - Class in org.apache.spark.streaming.scheduler
StreamingListenerOutputOperationCompleted(OutputOperationInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
StreamingListenerOutputOperationStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerOutputOperationStarted(OutputOperationInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
StreamingLogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification: Train or predict a logistic regression model on streaming data.
StreamingLogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD: Construct a StreamingLogisticRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}.
StreamingTest - Class in org.apache.spark.mllib.stat.test
StreamingTest() - Constructor for class org.apache.spark.mllib.stat.test.StreamingTest
StreamInputInfo - Class in org.apache.spark.streaming.scheduler: :: DeveloperApi :: Track the information of input stream at specified batch time.
StreamInputInfo(int, long, Map<String, Object>) - Constructor for class org.apache.spark.streaming.scheduler.StreamInputInfo
string() - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type string.
STRING() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable string type.
StringArrayParam - Class in org.apache.spark.ml.param: :: DeveloperApi :: Specialized version of Param[Array[String]] for Java.
StringArrayParam(Params, String, String, Function1<String[], Object>) - Constructor for class org.apache.spark.ml.param.StringArrayParam
StringArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.StringArrayParam
StringContains - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a string that contains the string value.
StringContains(String, String) - Constructor for class org.apache.spark.sql.sources.StringContains
StringEndsWith - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a string that ends with value.
StringEndsWith(String, String) - Constructor for class org.apache.spark.sql.sources.StringEndsWith
StringIndexer - Class in org.apache.spark.ml.feature: :: Experimental :: A label indexer that maps a string column of labels to an ML column of label indices.
StringIndexer(String) - Constructor for class org.apache.spark.ml.feature.StringIndexer
StringIndexer() - Constructor for class org.apache.spark.ml.feature.StringIndexer
StringIndexerModel - Class in org.apache.spark.ml.feature: :: Experimental :: Model fitted by StringIndexer.
StringIndexerModel(String, String[]) - Constructor for class org.apache.spark.ml.feature.StringIndexerModel
StringIndexerModel(String[]) - Constructor for class org.apache.spark.ml.feature.StringIndexerModel
stringRddToDataFrameHolder(RDD<String>) - Method in class org.apache.spark.sql.SQLImplicits: Creates a single column DataFrame from an RDD[String].
stringResult() - Method in class org.apache.spark.sql.hive.HiveContext.QueryExecution: Returns the result as a hive compatible sequence of strings.
StringRRDD<T> - Class in org.apache.spark.api.r: An RDD that stores R objects as Array[String].
StringRRDD(RDD<T>, byte[], String, byte[], Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.StringRRDD
StringStartsWith - Class in org.apache.spark.sql.sources: A filter that evaluates to true iff the attribute evaluates to a string that starts with value.
StringStartsWith(String, String) - Constructor for class org.apache.spark.sql.sources.StringStartsWith
stringToText(String) - Static method in class org.apache.spark.SparkContext
StringType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the StringType object.
StringType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing String values.
stringWritableConverter() - Static method in class org.apache.spark.SparkContext
stronglyConnectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps: Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.
StronglyConnectedComponents - Class in org.apache.spark.graphx.lib: Strongly connected components algorithm implementation.
StronglyConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.StronglyConnectedComponents
struct(Seq<StructField>) - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type struct.
struct(StructType) - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type struct.
struct(Column...) - Static method in class org.apache.spark.sql.functions: Creates a new struct column.
struct(String, String...) - Static method in class org.apache.spark.sql.functions: Creates a new struct column that composes multiple input columns.
struct(Seq<Column>) - Static method in class org.apache.spark.sql.functions: Creates a new struct column.
struct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions: Creates a new struct column that composes multiple input columns.
StructField - Class in org.apache.spark.sql.types: A field inside a StructType.
StructField(String, DataType, boolean, Metadata) - Constructor for class org.apache.spark.sql.types.StructField
StructField() - Constructor for class org.apache.spark.sql.types.StructField: No-arg constructor for kryo.
StructType - Class in org.apache.spark.sql.types: :: DeveloperApi :: A StructType object can be constructed by
StructType(StructField[]) - Constructor for class org.apache.spark.sql.types.StructType
StructType() - Constructor for class org.apache.spark.sql.types.StructType: No-arg constructor for kryo.
subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.Graph: Restricts the graph to only the vertices and edges satisfying the predicates.
subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.GraphImpl
submissionTime() - Method in class org.apache.spark.scheduler.StageInfo: When this stage was submitted from the DAGScheduler to a TaskScheduler.
submissionTime() - Method in interface org.apache.spark.SparkStageInfo
submissionTime() - Method in class org.apache.spark.SparkStageInfoImpl
submissionTime() - Method in class org.apache.spark.status.api.v1.JobData
submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext: Submit a job for execution and return a FutureJob holding the result.
subsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
subsetAccuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics: Returns subset accuracy (for equal sets of labels)
substitutor() - Method in class org.apache.spark.sql.hive.HiveContext
substr(Column, Column) - Method in class org.apache.spark.sql.Column: An expression that returns a substring.
substr(int, int) - Method in class org.apache.spark.sql.Column: An expression that returns a substring.
substring(Column, int, int) - Static method in class org.apache.spark.sql.functions: Substring starts at pos and is of length len when str is String type or returns the slice of byte array that starts at pos in byte and is of length len when str is Binary type
substring_index(Column, String, int) - Static method in class org.apache.spark.sql.functions: Returns the substring from string str before count occurrences of the delimiter delim.
subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Return an RDD with the elements from this that are not in other.
subtract(Dataset<T>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset where any elements present in other have been removed.
subtract(Vector) - Method in class org.apache.spark.util.Vector
subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from this whose keys are not in other.
subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the pairs from this whose keys are not in other.
succeededTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
Success - Class in org.apache.spark: :: DeveloperApi :: Task succeeded.
Success() - Constructor for class org.apache.spark.Success
successful() - Method in class org.apache.spark.scheduler.TaskInfo
sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Add up the elements in this RDD.
sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Add up the elements in this RDD.
sum(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sum of all values in the expression.
sum(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sum of all values in the given column.
sum(String...) - Method in class org.apache.spark.sql.GroupedData: Compute the sum for each numeric column for each group.
sum(Seq<String>) - Method in class org.apache.spark.sql.GroupedData: Compute the sum for each numeric column for each group.
sum() - Method in class org.apache.spark.util.StatCounter
sum() - Method in class org.apache.spark.util.Vector
sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Approximate operation to return the sum within a timeout.
sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Approximate operation to return the sum within a timeout.
sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Approximate operation to return the sum within a timeout.
sumDistinct(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sum of distinct values in the expression.
sumDistinct(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the sum of distinct values in the expression.
summary() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: Gets summary of model on training set.
summary() - Method in class org.apache.spark.ml.regression.LinearRegressionModel: Gets summary (e.g.
supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier: Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2
supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor: Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2
supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.mllib.tree.RandomForest: List of supported feature subset sampling strategies.
supportedImpurities() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier: Accessor for supported impurities: entropy, gini
supportedImpurities() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier: Accessor for supported impurity settings: entropy, gini
supportedImpurities() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor: Accessor for supported impurities: variance
supportedImpurities() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor: Accessor for supported impurity settings: variance
supportedLossTypes() - Static method in class org.apache.spark.ml.classification.GBTClassifier: Accessor for supported loss settings: logistic
supportedLossTypes() - Static method in class org.apache.spark.ml.regression.GBTRegressor: Accessor for supported loss settings: squared (L2), absolute (L1)
supportedModelTypes() - Static method in class org.apache.spark.mllib.classification.NaiveBayes
supportsRelocationOfSerializedObjects() - Method in class org.apache.spark.serializer.KryoSerializer
SVDPlusPlus - Class in org.apache.spark.graphx.lib: Implementation of SVD++ algorithm.
SVDPlusPlus() - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus
SVDPlusPlus.Conf - Class in org.apache.spark.graphx.lib: Configuration parameters for SVDPlusPlus.
SVDPlusPlus.Conf(int, int, double, double, double, double, double, double) - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
SVMDataGenerator - Class in org.apache.spark.mllib.util: :: DeveloperApi :: Generate sample data used for SVM.
SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
SVMModel - Class in org.apache.spark.mllib.classification: Model for Support Vector Machines (SVMs).
SVMModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.SVMModel
SVMWithSGD - Class in org.apache.spark.mllib.classification: Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD: Construct an SVM object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
symbolToColumn(Symbol) - Method in class org.apache.spark.sql.SQLImplicits: An implicit conversion that turns a Scala Symbol into a Column.
SYSTEM_DEFAULT() - Static method in class org.apache.spark.sql.types.DecimalType
systemProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
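
The substring_index entry above describes its result only as the substring "before count occurrences of the delimiter". The plain-Java sketch below illustrates that behavior without depending on Spark. It is not Spark's implementation: the class name SubstringIndexSketch and the method substringIndex are hypothetical, and the handling of a negative count (counting delimiters from the right), of count == 0, and of an empty delimiter is an assumption modeled on the SQL SUBSTRING_INDEX function.

```java
// Hypothetical helper illustrating the documented substring_index behavior.
// Not Spark's source; edge-case handling below is an assumption.
final class SubstringIndexSketch {

    static String substringIndex(String str, String delim, int count) {
        // Assumption: an empty delimiter or a zero count yields an empty string.
        if (delim.isEmpty() || count == 0) {
            return "";
        }
        if (count > 0) {
            // Walk left to right to the count-th delimiter; keep what precedes it.
            int idx = -1;
            for (int i = 0; i < count; i++) {
                idx = str.indexOf(delim, idx + 1);
                if (idx < 0) {
                    return str; // fewer than count delimiters: whole string
                }
            }
            return str.substring(0, idx);
        }
        // Assumption: a negative count walks right to left and keeps what follows.
        int idx = str.length();
        for (int i = 0; i < -count; i++) {
            idx = str.lastIndexOf(delim, idx - 1);
            if (idx < 0) {
                return str; // fewer than |count| delimiters: whole string
            }
        }
        return str.substring(idx + delim.length());
    }

    public static void main(String[] args) {
        System.out.println(substringIndex("www.apache.org", ".", 2));  // www.apache
        System.out.println(substringIndex("www.apache.org", ".", -2)); // apache.org
        System.out.println(substringIndex("www.apache.org", ".", 5));  // www.apache.org
    }
}
```

In Spark itself the function is applied to a Column, as in the substring_index(Column, String, int) signature listed above; the sketch only mirrors the per-value result.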

T

t() - Method in class org.apache.spark.SerializableWritable
table(String) - Method in class org.apache.spark.sql.DataFrameReader: Returns the specified table as a DataFrame.
table(String) - Method in class org.apache.spark.sql.SQLContext
tableNames() - Method in class org.apache.spark.sql.SQLContext
tableNames(String) - Method in class org.apache.spark.sql.SQLContext
tables() - Method in class org.apache.spark.sql.SQLContext
tables(String) - Method in class org.apache.spark.sql.SQLContext
TableScan - Interface in org.apache.spark.sql.sources: :: DeveloperApi :: A BaseRelation that can produce all of its tuples as an RDD of Row objects.
tachyonFolderName() - Method in class org.apache.spark.SparkContext
tag() - Method in class org.apache.spark.sql.types.BinaryType
tag() - Method in class org.apache.spark.sql.types.BooleanType
tag() - Method in class org.apache.spark.sql.types.ByteType
tag() - Method in class org.apache.spark.sql.types.DateType
tag() - Method in class org.apache.spark.sql.types.DecimalType
tag() - Method in class org.apache.spark.sql.types.DoubleType
tag() - Method in class org.apache.spark.sql.types.FloatType
tag() - Method in class org.apache.spark.sql.types.IntegerType
tag() - Method in class org.apache.spark.sql.types.LongType
tag() - Method in class org.apache.spark.sql.types.ShortType
tag() - Method in class org.apache.spark.sql.types.StringType
tag() - Method in class org.apache.spark.sql.types.TimestampType
take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.rdd.RDD: Take the first num elements of the RDD.
take(int) - Method in class org.apache.spark.sql.DataFrame: Returns the first n rows in the DataFrame.
take(int) - Method in class org.apache.spark.sql.Dataset: Returns the first num elements of this Dataset as an array.
takeAsList(int) - Method in class org.apache.spark.sql.DataFrame: Returns the first n rows in the DataFrame as a list.
takeAsList(int) - Method in class org.apache.spark.sql.Dataset: Returns the first num elements of this Dataset as a list.
takeAsync(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: The asynchronous version of the take action, which returns a future for retrieving the first num elements of this RDD.
takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions: Returns a future for retrieving the first num elements of the RDD.
takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the first k (smallest) elements from this RDD as defined by the specified Comparator[T] and maintains the order.
takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the first k (smallest) elements from this RDD using the natural ordering for T while maintaining the order.
takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD: Returns the first k (smallest) elements from this RDD as defined by the specified implicit Ordering[T] and maintains the ordering.
takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD: Return a fixed-size sampled subset of this RDD in an array
tallSkinnyQR(boolean) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix: Compute QR decomposition for RowMatrix.
tan(Column) - Static method in class org.apache.spark.sql.functions: Computes the tangent of the given value.
tan(String) - Static method in class org.apache.spark.sql.functions: Computes the tangent of the given column.
tanh(Column) - Static method in class org.apache.spark.sql.functions: Computes the hyperbolic tangent of the given value.
tanh(String) - Static method in class org.apache.spark.sql.functions: Computes the hyperbolic tangent of the given column.
targetStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
targetStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
task() - Method in class org.apache.spark.CleanupTaskWeakReference
taskAttemptId() - Method in class org.apache.spark.TaskContext: An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID).
TaskCommitDenied - Class in org.apache.spark: :: DeveloperApi :: Task requested the driver to commit, but was denied.
TaskCommitDenied(int, int, int) - Constructor for class org.apache.spark.TaskCommitDenied
TaskCompletionListener - Interface in org.apache.spark.util: :: DeveloperApi ::
TaskContext - Class in org.apache.spark: Contextual information about a task which can be read or mutated during execution.
TaskContext() - Constructor for class org.apache.spark.TaskContext
TaskData - Class in org.apache.spark.status.api.v1
TaskEndReason - Interface in org.apache.spark: :: DeveloperApi :: Various possible reasons why a task ended.
TaskFailedReason - Interface in org.apache.spark: :: DeveloperApi :: Various possible reasons why a task failed.
taskId() - Method in class org.apache.spark.scheduler.local.KillTask
taskId() - Method in class org.apache.spark.scheduler.local.StatusUpdate
taskId() - Method in class org.apache.spark.scheduler.TaskInfo
taskId() - Method in class org.apache.spark.status.api.v1.TaskData
taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
TaskInfo - Class in org.apache.spark.scheduler: :: DeveloperApi :: Information about a running task attempt inside a TaskSet.
TaskInfo(long, int, int, long, String, String, Enumeration.Value, boolean) - Constructor for class org.apache.spark.scheduler.TaskInfo
TaskKilled - Class in org.apache.spark: :: DeveloperApi :: Task was killed intentionally and needs to be rescheduled.
TaskKilled() - Constructor for class org.apache.spark.TaskKilled
TaskKilledException - Exception in org.apache.spark: :: DeveloperApi :: Exception thrown when a task is explicitly killed (i.e., task failure is expected).
TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
TaskLocality - Class in org.apache.spark.scheduler
TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
taskLocality() - Method in class org.apache.spark.status.api.v1.TaskData
taskLocalityPreferences() - Method in class org.apache.spark.scheduler.StageInfo
TaskMetricDistributions - Class in org.apache.spark.status.api.v1
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
taskMetrics() - Method in class org.apache.spark.status.api.v1.TaskData
TaskMetrics - Class in org.apache.spark.status.api.v1
taskMetrics() - Method in class org.apache.spark.TaskContext: :: DeveloperApi ::
TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
TaskResultBlockId - Class in org.apache.spark.storage
TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
TaskResultLost - Class in org.apache.spark: :: DeveloperApi :: The task finished successfully, but the result was lost from the executor's block manager before it was fetched.
TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
tasks() - Method in class org.apache.spark.status.api.v1.StageData
TaskSorting - Enum in org.apache.spark.status.api.v1
taskTime() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
TEST() - Static method in class org.apache.spark.storage.BlockId
TestResult<DF> - Interface in org.apache.spark.mllib.stat.test: Trait for hypothesis test results.
text(String...) - Method in class org.apache.spark.sql.DataFrameReader: Loads a text file and returns a DataFrame with a single string column named "value".
text(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader: Loads a text file and returns a DataFrame with a single string column named "value".
text(String) - Method in class org.apache.spark.sql.DataFrameWriter: Saves the content of the DataFrame in a text file at the specified path.
textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFile(String, int) - Method in class org.apache.spark.SparkContext: Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext: Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat).
theta() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
threshold() - Method in class org.apache.spark.ml.feature.Binarizer: Param for threshold used to binarize continuous features.
threshold() - Method in class org.apache.spark.ml.tree.ContinuousSplit
threshold() - Method in class org.apache.spark.mllib.tree.model.Split
thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Returns thresholds in descending order.
throwBalls() - Method in class org.apache.spark.rdd.PartitionCoalescer
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
time() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
time() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
Time - Class in org.apache.spark.streaming: This is a simple class that represents an absolute instant of time.
Time(long) - Constructor for class org.apache.spark.streaming.Time
timeout(Duration) - Method in class org.apache.spark.streaming.StateSpec: Set the duration after which the state of an idle key will be removed.
times(int) - Method in class org.apache.spark.streaming.Duration
timestamp() - Method in class org.apache.spark.sql.ColumnName: Creates a new StructField of type timestamp.
TIMESTAMP() - Static method in class org.apache.spark.sql.Encoders: An encoder for nullable timestamp type.
TimestampType - Static variable in class org.apache.spark.sql.types.DataTypes: Gets the TimestampType object.
TimestampType - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type representing java.sql.Timestamp values.
TimeTrackingOutputStream - Class in org.apache.spark.storage: Intercepts write calls and tracks total time spent writing in order to update shuffle write metrics.
TimeTrackingOutputStream(ShuffleWriteMetrics, OutputStream) - Constructor for class org.apache.spark.storage.TimeTrackingOutputStream
timeUnit() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
TIMING_DATA() - Static method in class org.apache.spark.api.r.SpecialLengths
to(Time, Duration) - Method in class org.apache.spark.streaming.Time
to_date(Column) - Static method in class org.apache.spark.sql.functions: Converts the column into DateType.
to_utc_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions: Assumes given timestamp is in given timezone and converts to UTC.
toArray() - Method in interface org.apache.spark.api.java.JavaRDDLike: Deprecated. As of Spark 1.0.0, toArray() is deprecated; use JavaRDDLike.collect() instead.
toArray() - Method in class org.apache.spark.input.PortableDataStream: Read the file as a byte array
toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix: Converts to a dense array in column-major order.
toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
toArray() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts the instance to a double array.
toArray() - Method in class org.apache.spark.rdd.RDD: Return an array that contains all of the elements in this RDD.
toAttributes() - Method in class org.apache.spark.sql.types.StructType
toBigDecimal() - Method in class org.apache.spark.sql.types.Decimal
toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix: Collects data and assembles a local dense breeze matrix (for test only).
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix: Converts to a breeze matrix.
toBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts the instance to a breeze vector.
toByte() - Method in class org.apache.spark.sql.types.Decimal
toColumn(Encoder, Encoder<O>) - Method in class org.apache.spark.sql.expressions.Aggregator: Returns this Aggregator as a TypedColumn that can be used in Dataset or DataFrame operations.
toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Converts to CoordinateMatrix.
toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike: A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Print the full model to a string.
toDebugString() - Method in class org.apache.spark.rdd.RDD: A description of this RDD and its recursive dependencies for debugging.
toDebugString() - Method in class org.apache.spark.SparkConf: Return a string listing all keys and values, one per line.
toDebugString() - Method in class org.apache.spark.sql.types.Decimal
toDegrees(Column) - Static method in class org.apache.spark.sql.functions: Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
toDegrees(String) - Static method in class org.apache.spark.sql.functions: Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
toDense() - Method in class org.apache.spark.mllib.linalg.SparseMatrix: Generate a DenseMatrix from the given SparseMatrix.
toDense() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts this vector to a dense vector.
toDF(String...) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with columns renamed.
toDF() - Method in class org.apache.spark.sql.DataFrame: Returns the object itself.
toDF(Seq<String>) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with columns renamed.
toDF() - Method in class org.apache.spark.sql.DataFrameHolder
toDF(Seq<String>) - Method in class org.apache.spark.sql.DataFrameHolder
toDF() - Method in class org.apache.spark.sql.Dataset: Converts this strongly typed collection of data to a generic DataFrame.
toDouble() - Method in class org.apache.spark.sql.types.Decimal
toDS() - Method in class org.apache.spark.sql.Dataset: Returns this Dataset.
toDS() - Method in class org.apache.spark.sql.DatasetHolder
toEdgeTriplet() - Method in class org.apache.spark.graphx.EdgeContext: Converts the edge and vertex properties into an EdgeTriplet for convenience.
toErrorString() - Method in class org.apache.spark.ExceptionFailure
toErrorString() - Method in class org.apache.spark.ExecutorLostFailure
toErrorString() - Method in class org.apache.spark.FetchFailed
toErrorString() - Static method in class org.apache.spark.Resubmitted
toErrorString() - Method in class org.apache.spark.TaskCommitDenied
toErrorString() - Method in interface org.apache.spark.TaskFailedReason: Error message displayed in the web UI.
toErrorString() - Static method in class org.apache.spark.TaskKilled
toErrorString() - Static method in class org.apache.spark.TaskResultLost
toErrorString() - Static method in class org.apache.spark.UnknownReason
toFloat() - Method in class org.apache.spark.sql.types.Decimal
toFormattedString() - Method in class org.apache.spark.streaming.Duration
toHiveString(Tuple2<Object, DataType>) - Static method in class org.apache.spark.sql.hive.HiveContext
toHiveStructString(Tuple2<Object, DataType>) - Static method in class org.apache.spark.sql.hive.HiveContext: Hive outputs fields of structs slightly differently than top level attributes.
toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Converts to IndexedRowMatrix.
toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
toInt() - Method in class org.apache.spark.sql.types.Decimal
toInt() - Method in class org.apache.spark.storage.StorageLevel
toJavaBigDecimal() - Method in class org.apache.spark.sql.types.Decimal
toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Convert to a JavaDStream
toJavaRDD() - Method in class org.apache.spark.rdd.RDD
toJavaRDD() - Method in class org.apache.spark.sql.DataFrame: Returns the content of the DataFrame as a JavaRDD of Rows.
toJson() - Method in class org.apache.spark.mllib.linalg.DenseVector
toJson() - Method in class org.apache.spark.mllib.linalg.SparseVector
toJson() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts the vector to a JSON string.
toJSON() - Method in class org.apache.spark.sql.DataFrame: Returns the content of the DataFrame as an RDD of JSON strings.
Tokenizer - Class in org.apache.spark.ml.feature: :: Experimental :: A tokenizer that converts the input string to lowercase and then splits it by white spaces.
Tokenizer(String) - Constructor for class org.apache.spark.ml.feature.Tokenizer
Tokenizer() - Constructor for class org.apache.spark.ml.feature.Tokenizer
toLocal() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel: Convert this distributed model to a local representation.
toLocal() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Convert model to a local model.
toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike: Return an iterator that contains all of the elements in this RDD.
toLocalIterator() - Method in class org.apache.spark.rdd.RDD: Return an iterator that contains all of the elements in this RDD.
toLocalMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Collect the distributed matrix on the driver as a `DenseMatrix`.
toLong() - Method in class org.apache.spark.sql.types.Decimal
toLowercase() - Method in class org.apache.spark.ml.feature.RegexTokenizer: Indicates whether to convert all characters to lowercase before tokenizing.
toMetadata(Metadata) - Method in class org.apache.spark.ml.attribute.Attribute: Converts to ML metadata with some existing metadata.
toMetadata() - Method in class org.apache.spark.ml.attribute.Attribute: Converts to ML metadata.
toMetadata(Metadata) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Converts to ML metadata with some existing metadata.
toMetadata() - Method in class org.apache.spark.ml.attribute.AttributeGroup: Converts to ML metadata.
toOld() - Method in interface org.apache.spark.ml.tree.Split: Convert to old Split format
top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the top k (largest) elements from this RDD as defined by the specified Comparator[T] and maintains the order.
top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Returns the top k (largest) elements from this RDD using the natural ordering for T and maintains the order.
top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.dstream.DStream
toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.StreamingContext: Deprecated.
As of 1.3.0, replaced by implicit functions in the DStream companion object. This is kept here only for backward compatibility.
topByKey(int, Ordering<V>) - Method in class org.apache.spark.mllib.rdd.MLPairRDDFunctions: Returns the top k (largest) elements for each key from this RDD as defined by the specified implicit Ordering[T].
topDocumentsPerTopic(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Return the top documents for each topic
topic() - Method in class org.apache.spark.streaming.kafka.OffsetRange
topicAndPartition() - Method in class org.apache.spark.streaming.kafka.OffsetRange: Kafka TopicAndPartition object, for convenience
topicAssignments() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Return the top topic for each (doc, term) pair.
topicConcentration() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
topicConcentration() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
topicConcentration() - Method in class org.apache.spark.mllib.clustering.LDAModel: Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics' distributions over terms.
topicConcentration() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
topicDistributions() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: For each document in the training set, return the distribution over topics for that document ("theta_doc").
topicDistributions(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Predicts the topic mixture distribution for each document (often called "theta" in the literature).
topicDistributions(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel: Java-friendly version of topicDistributions
topics() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
topicsMatrix() - Method in class org.apache.spark.ml.clustering.LDAModel: Inferred topics, where each topic is represented by a distribution over terms.
topicsMatrix() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: Inferred topics, where each topic is represented by a distribution over terms.
topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LDAModel: Inferred topics, where each topic is represented by a distribution over terms.
topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
toPMML(StreamResult) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable: Export the model to the stream result in PMML format
toPMML(String) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable: :: Experimental :: Export the model to a local file in PMML format
toPMML(SparkContext, String) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable: :: Experimental :: Export the model to a directory on a distributed file system in PMML format
toPMML(OutputStream) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable: :: Experimental :: Export the model to the OutputStream in PMML format
toPMML() - Method in interface org.apache.spark.mllib.pmml.PMMLExportable: :: Experimental :: Export the model to a String in PMML format
topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
topTopicsPerDocument(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel: For each document, return the top k weighted topics for that document and their weights.
toRadians(Column) - Static method in class org.apache.spark.sql.functions: Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
toRadians(String) - Static method in class org.apache.spark.sql.functions: Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
TorrentBroadcastFactory - Class in org.apache.spark.broadcast: A Broadcast implementation that uses a BitTorrent-like protocol to do a distributed transfer of the broadcasted data to the executors.
TorrentBroadcastFactory() - Constructor for class org.apache.spark.broadcast.TorrentBroadcastFactory
toSchemaRDD() - Method in class org.apache.spark.sql.DataFrame: Deprecated.
As of 1.3.0, replaced by toDF(). This will be removed in Spark 2.0.
toSeq() - Method in class org.apache.spark.ml.param.ParamMap: Converts this param map to a sequence of param pairs.
toSeq() - Method in interface org.apache.spark.sql.Row: Return a Scala Seq representing the row.
toShort() - Method in class org.apache.spark.sql.types.Decimal
toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
toSparse() - Method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a SparseMatrix from the given DenseMatrix.
toSparse() - Method in class org.apache.spark.mllib.linalg.DenseVector
toSparse() - Method in class org.apache.spark.mllib.linalg.SparseVector
toSparse() - Method in interface org.apache.spark.mllib.linalg.Vector: Converts this vector to a sparse vector with all explicit zeros removed.
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
toString() - Method in class org.apache.spark.Accumulable
toString() - Method in class org.apache.spark.api.java.JavaRDD
toString() - Method in class org.apache.spark.broadcast.Broadcast
toString() - Method in class org.apache.spark.graphx.EdgeDirection
toString() - Method in class org.apache.spark.graphx.EdgeTriplet
toString() - Method in class org.apache.spark.ml.attribute.Attribute
toString() - Method in class org.apache.spark.ml.attribute.AttributeGroup
toString() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
toString() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
toString() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
toString() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
toString() - Method in class org.apache.spark.ml.feature.RFormula
toString() - Method in class org.apache.spark.ml.feature.RFormulaModel
toString() - Method in class org.apache.spark.ml.param.Param
toString() - Method in class org.apache.spark.ml.param.ParamMap
toString() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
toString() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
toString() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
toString() - Method in class org.apache.spark.ml.tree.InternalNode
toString() - Method in class org.apache.spark.ml.tree.LeafNode
toString() - Method in interface org.apache.spark.ml.util.Identifiable
toString() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
toString() - Method in class org.apache.spark.mllib.classification.SVMModel
toString() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
toString() - Method in interface org.apache.spark.mllib.linalg.Matrix: A human readable representation of the matrix
toString(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix: A human readable representation of the matrix with maximum lines and width
toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
toString() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel: Print a summary of the model.
toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
toString() - Method in class org.apache.spark.mllib.stat.test.BinarySample
toString() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
toString() - Method in class org.apache.spark.mllib.stat.test.KolmogorovSmirnovTestResult
toString() - Method in interface org.apache.spark.mllib.stat.test.TestResult: String explaining the hypothesis test result.
toString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel: Print a summary of the model.
toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
toString() - Method in class org.apache.spark.mllib.tree.model.Node
toString() - Method in class org.apache.spark.mllib.tree.model.Predict
toString() - Method in class org.apache.spark.mllib.tree.model.Split
toString() - Method in class org.apache.spark.partial.BoundedDouble
toString() - Method in class org.apache.spark.partial.PartialResult
toString() - Method in class org.apache.spark.rdd.RDD
toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
toString() - Method in class org.apache.spark.scheduler.SplitInfo
toString() - Method in class org.apache.spark.SerializableWritable
toString() - Method in class org.apache.spark.sql.Column
toString() - Method in interface org.apache.spark.sql.Row
toString() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
toString() - Method in class org.apache.spark.sql.types.Decimal
toString() - Method in class org.apache.spark.sql.types.DecimalType
toString() - Method in class org.apache.spark.sql.types.Metadata
toString() - Method in class org.apache.spark.sql.types.StructField
toString() - Method in class org.apache.spark.storage.BlockId
toString() - Method in class org.apache.spark.storage.BlockManagerId
toString() - Method in class org.apache.spark.storage.RDDInfo
toString() - Method in class org.apache.spark.storage.StorageLevel
toString() - Method in class org.apache.spark.streaming.Duration
toString() - Method in class org.apache.spark.streaming.kafka.Broker
toString() - Method in class org.apache.spark.streaming.kafka.OffsetRange
toString() - Method in class org.apache.spark.streaming.State
toString() - Method in class org.apache.spark.streaming.Time
toString() - Method in class org.apache.spark.util.MutablePair
toString() - Method in class org.apache.spark.util.StatCounter
toString() - Method in class org.apache.spark.util.Vector
toStructField(Metadata) - Method in class org.apache.spark.ml.attribute.Attribute: Converts to a StructField with some existing metadata.
toStructField() - Method in class org.apache.spark.ml.attribute.Attribute: Converts to a StructField.
toStructField(Metadata) - Method in class org.apache.spark.ml.attribute.AttributeGroup: Converts to a StructField with some existing metadata.
toStructField() - Method in class org.apache.spark.ml.attribute.AttributeGroup: Converts to a StructField.
totalBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
totalBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
totalCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo: Time taken for all the jobs of this batch to finish processing from the time they were submitted.
totalDuration() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalInputBytes() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalIterations() - Method in interface org.apache.spark.ml.classification.LogisticRegressionTrainingSummary: Number of training iterations until termination
totalIterations() - Method in class org.apache.spark.ml.regression.LinearRegressionTrainingSummary
totalShuffleRead() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalShuffleWrite() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
totalTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
toTuple() - Method in class org.apache.spark.graphx.EdgeTriplet
toUnscaledLong() - Method in class org.apache.spark.sql.types.Decimal
train(DataFrame) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
train(DataFrame) - Method in class org.apache.spark.ml.classification.GBTClassifier
train(DataFrame) - Method in class org.apache.spark.ml.classification.LogisticRegression
train(DataFrame) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier: Train a model using the given dataset and parameters.
train(DataFrame) - Method in class org.apache.spark.ml.classification.NaiveBayes
train(DataFrame) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
train(DataFrame) - Method in class org.apache.spark.ml.Predictor: Train a model using the given dataset and parameters.
train(RDD<ALS.Rating<ID>>, int, int, int, int, double, boolean, double, boolean, StorageLevel, StorageLevel, int, long, ClassTag<ID>, Ordering<ID>) - Static method in class org.apache.spark.ml.recommendation.ALS
train(DataFrame) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
train(DataFrame) - Method in class org.apache.spark.ml.regression.GBTRegressor
train(DataFrame) - Method in class org.apache.spark.ml.regression.LinearRegression
train(DataFrame) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD: Train a logistic regression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
train(RDD<LabeledPoint>, double, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD: Train an SVM model given an RDD of (label, features) pairs.
train(RDD<Vector>, int, int, int, String, long) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using the given set of parameters.
train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using the given set of parameters.
train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using the specified parameters and default values for the unspecified ones.
train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans: Trains a k-means model using the specified parameters and default values for the unspecified ones.
train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD: Train a Lasso model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD: Train a LinearRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD: Train a RidgeRegression model given an RDD of (label, features) pairs.
train(RDD<LabeledPoint>, Strategy) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model.
train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model.
train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model.
train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model.
train(RDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Method to train a gradient boosting model.
train(JavaRDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees: Java-friendly API for GradientBoostedTrees$.train(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.BoostingStrategy)
trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model for binary or multiclass classification.
trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Java-friendly API for DecisionTree$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
trainClassifier(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Method to train a random forest model for binary or multiclass classification.
trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Method to train a random forest model for binary or multiclass classification.
trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Java-friendly API for RandomForest$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
trainingLogLikelihood() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel: Log likelihood of the observed tokens in the training set, given the current parameter estimates: log P(docs | topics, topic distributions for docs, Dirichlet hyperparameters)
trainOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Update the clustering model by training on batches of data from a DStream.
trainOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans: Java-friendly version of trainOn.
trainOn(DStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Update the model by training on batches of data from a DStream.
trainOn(JavaDStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm: Java-friendly version of trainOn.
trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Method to train a decision tree model for regression.
trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree: Java-friendly API for DecisionTree$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
trainRegressor(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Method to train a random forest model for regression.
trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Method to train a random forest model for regression.
trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest: Java-friendly API for RandomForest$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
TrainValidationSplit - Class in org.apache.spark.ml.tuning: :: Experimental :: Validation for hyper-parameter tuning.
TrainValidationSplit(String) - Constructor for class org.apache.spark.ml.tuning.TrainValidationSplit
TrainValidationSplit() - Constructor for class org.apache.spark.ml.tuning.TrainValidationSplit
TrainValidationSplitModel - Class in org.apache.spark.ml.tuning: :: Experimental :: Model from train validation split.
transform(DataFrame) - Method in class org.apache.spark.ml.classification.ClassificationModel: Transforms dataset by reading from featuresCol, and appending new columns as specified by parameters: predicted labels as predictionCol of type Double; raw predictions (confidences) as rawPredictionCol of type Vector.
transform(DataFrame) - Method in class org.apache.spark.ml.classification.OneVsRestModel
transform(DataFrame) - Method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel: Transforms dataset by reading from featuresCol, and appending new columns as specified by parameters: predicted labels as predictionCol of type Double; raw predictions (confidences) as rawPredictionCol of type Vector; probability of each class as probabilityCol of type Vector.
transform(DataFrame) - Method in class org.apache.spark.ml.clustering.KMeansModel
transform(DataFrame) - Method in class org.apache.spark.ml.clustering.LDAModel: Transforms the input dataset.
transform(DataFrame) - Method in class org.apache.spark.ml.feature.Binarizer
transform(DataFrame) - Method in class org.apache.spark.ml.feature.Bucketizer
transform(DataFrame) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
transform(DataFrame) - Method in class org.apache.spark.ml.feature.ColumnPruner
transform(DataFrame) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
transform(DataFrame) - Method in class org.apache.spark.ml.feature.HashingTF
transform(DataFrame) - Method in class org.apache.spark.ml.feature.IDFModel
transform(DataFrame) - Method in class org.apache.spark.ml.feature.IndexToString
transform(DataFrame) - Method in class org.apache.spark.ml.feature.Interaction
transform(DataFrame) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
transform(DataFrame) - Method in class org.apache.spark.ml.feature.OneHotEncoder
transform(DataFrame) - Method in class org.apache.spark.ml.feature.PCAModel
transform(DataFrame) - Method in class org.apache.spark.ml.feature.RFormulaModel
transform(DataFrame) - Method in class org.apache.spark.ml.feature.SQLTransformer
transform(DataFrame) - Method in class org.apache.spark.ml.feature.StandardScalerModel
transform(DataFrame) - Method in class org.apache.spark.ml.feature.StopWordsRemover
transform(DataFrame) - Method in class org.apache.spark.ml.feature.StringIndexerModel
transform(DataFrame) - Method in class org.apache.spark.ml.feature.VectorAssembler
transform(DataFrame) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
transform(DataFrame) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
transform(DataFrame) - Method in class org.apache.spark.ml.feature.VectorSlicer
transform(DataFrame) - Method in class org.apache.spark.ml.feature.Word2VecModel: Transform a sentence column to a vector column to represent the whole sentence.
transform(DataFrame) - Method in class org.apache.spark.ml.PipelineModel
transform(DataFrame) - Method in class org.apache.spark.ml.PredictionModel: Transforms dataset by reading from featuresCol, calling predict(), and storing the predictions as a new column predictionCol.
transform(DataFrame) - Method in class org.apache.spark.ml.recommendation.ALSModel
transform(DataFrame) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
transform(DataFrame) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
transform(DataFrame, ParamPair<?>, ParamPair<?>...) - Method in class org.apache.spark.ml.Transformer: Transforms the dataset with optional parameters.
transform(DataFrame, ParamPair<?>, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Transformer: Transforms the dataset with optional parameters.
transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Transformer: Transforms the dataset with provided parameter map as additional parameters.
transform(DataFrame) - Method in class org.apache.spark.ml.Transformer: Transforms the input dataset.
transform(DataFrame) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
transform(DataFrame) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
transform(DataFrame) - Method in class org.apache.spark.ml.UnaryTransformer
transform(Vector) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel: Applies transformation on a vector.
transform(Vector) - Method in class org.apache.spark.mllib.feature.ElementwiseProduct: Does the Hadamard product transformation.
transform(Iterable<Object>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document into a sparse term frequency vector.
transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document into a sparse term frequency vector (Java version).
transform(RDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document to term frequency vectors.
transform(JavaRDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF: Transforms the input document to term frequency vectors (Java version).
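The HashingTF entries above map a document to a sparse term frequency vector. The underlying hashing trick can be sketched as follows, assuming each term is placed in a bucket by a non-negative modulus of `hashCode()`. The hash function Spark uses may differ, and the class `HashingTfSketch` is hypothetical. A dense array is used here only to keep the example short.

```java
import java.util.List;

/** Illustrative hashing-trick term-frequency sketch; not Spark's implementation. */
final class HashingTfSketch {
    /** Maps a term to a bucket in [0, numFeatures) via a non-negative modulus. */
    static int indexOf(Object term, int numFeatures) {
        return Math.floorMod(term.hashCode(), numFeatures);
    }

    /** Counts how often each bucket is hit; terms that collide share one count. */
    static double[] termFrequencies(List<?> document, int numFeatures) {
        double[] tf = new double[numFeatures];
        for (Object term : document) {
            tf[indexOf(term, numFeatures)] += 1.0;
        }
        return tf;
    }
}
```

A larger `numFeatures` lowers the chance that two distinct terms land in the same bucket, at the cost of a longer vector.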
transform(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel: Transforms term frequency (TF) vectors to TF-IDF vectors.
transform(Vector) - Method in class org.apache.spark.mllib.feature.IDFModel: Transforms a term frequency (TF) vector to a TF-IDF vector.
transform(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel: Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
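The TF to TF-IDF step in the IDFModel entries above can be sketched in plain Java. The sketch assumes the smoothed weight idf = log((m + 1) / (df + 1)), where m is the number of documents and df is the number of documents containing the term. That formula is an assumption here, as is the hypothetical class `TfIdfSketch`. Consult the MLlib documentation for the exact definition and options such as a minimum document frequency.

```java
/** Illustrative TF-IDF sketch over dense arrays; not Spark's implementation. */
final class TfIdfSketch {
    /** Computes one IDF weight per feature from a corpus of TF vectors. */
    static double[] fitIdf(double[][] tfVectors) {
        int m = tfVectors.length;
        int n = (m == 0) ? 0 : tfVectors[0].length;
        double[] idf = new double[n];
        for (int j = 0; j < n; j++) {
            int df = 0; // number of documents in which feature j occurs
            for (double[] tf : tfVectors) {
                if (tf[j] > 0.0) df++;
            }
            idf[j] = Math.log((m + 1.0) / (df + 1.0));
        }
        return idf;
    }

    /** Scales each term frequency by its IDF weight. */
    static double[] transform(double[] tf, double[] idf) {
        double[] out = new double[tf.length];
        for (int j = 0; j < tf.length; j++) {
            out[j] = tf[j] * idf[j];
        }
        return out;
    }
}
```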
transform(Vector) - Method in class org.apache.spark.mllib.feature.Normalizer: Applies unit length normalization on a vector.
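Unit length normalization, as named in the Normalizer entry above, divides a vector by its p-norm. The following is a minimal sketch for finite p >= 1. The class `NormalizerSketch` is hypothetical, and the choice to return a zero vector unchanged is an assumption of this sketch.

```java
/** Illustrative p-norm normalization for finite p >= 1; not Spark's implementation. */
final class NormalizerSketch {
    static double[] normalize(double[] v, double p) {
        if (p < 1.0) {
            throw new IllegalArgumentException("p must be at least 1");
        }
        // p-norm: (sum of |x|^p)^(1/p)
        double sum = 0.0;
        for (double x : v) {
            sum += Math.pow(Math.abs(x), p);
        }
        double norm = Math.pow(sum, 1.0 / p);
        double[] out = v.clone();
        if (norm == 0.0) {
            return out; // assumption: a zero vector is returned unchanged
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= norm;
        }
        return out;
    }
}
```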
transform(Vector) - Method in class org.apache.spark.mllib.feature.PCAModel: Transform a vector by computed Principal Components.
transform(Vector) - Method in class org.apache.spark.mllib.feature.StandardScalerModel: Applies standardization transformation on a vector.
transform(Vector) - Method in interface org.apache.spark.mllib.feature.VectorTransformer: Applies transformation on a vector.
transform(RDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer: Applies transformation on an RDD[Vector].
transform(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer: Applies transformation on a JavaRDD[Vector].
transform(String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
transform(Function1<DataFrame, DataFrame>) - Method in class org.apache.spark.sql.DataFrame: Concise syntax for chaining custom transformations.
transform(Function1<Dataset<T>, Dataset>) - Method in class org.apache.spark.sql.Dataset: Concise syntax for chaining custom transformations.
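The chaining style described in the DataFrame and Dataset entries above amounts to a method that applies a caller-supplied function to the receiver. A minimal sketch follows, with a hypothetical `Table` class standing in for a dataset; none of these names come from Spark.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Illustrative chaining sketch; Table is a hypothetical stand-in for a dataset. */
final class TransformChainSketch {
    static final class Table {
        final List<Integer> rows;

        Table(List<Integer> rows) {
            this.rows = rows;
        }

        /** Applies a custom transformation to this table and returns the result. */
        Table transform(Function<Table, Table> f) {
            return f.apply(this);
        }
    }

    /** Custom transformation: keep only non-negative rows. */
    static Table dropNegatives(Table t) {
        return new Table(t.rows.stream().filter(x -> x >= 0).collect(Collectors.toList()));
    }

    /** Custom transformation: double every row. */
    static Table doubled(Table t) {
        return new Table(t.rows.stream().map(x -> x * 2).collect(Collectors.toList()));
    }
}
```

With this shape, `table.transform(TransformChainSketch::dropNegatives).transform(TransformChainSketch::doubled)` reads left to right instead of nesting the calls.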
transform(Function<R, JavaRDD>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<R, Time, JavaRDD>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transform(Function1<RDD<T>, RDD>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Function2<RDD<T>, Time, RDD>, ClassTag) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
Transformer - Class in org.apache.spark.ml: :: DeveloperApi :: Abstract class for transformers that transform one dataset into another.
Transformer() - Constructor for class org.apache.spark.ml.Transformer
transformImpl(DataFrame) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
transformImpl(DataFrame) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
transformImpl(DataFrame) - Method in class org.apache.spark.ml.PredictionModel
transformImpl(DataFrame) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
transformImpl(DataFrame) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
transformSchema(StructType) - Method in class org.apache.spark.ml.classification.OneVsRest
transformSchema(StructType) - Method in class org.apache.spark.ml.classification.OneVsRestModel
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.KMeans
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.KMeansModel
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.LDA
transformSchema(StructType) - Method in class org.apache.spark.ml.clustering.LDAModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Binarizer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Bucketizer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ChiSqSelector
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.ColumnPruner
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.CountVectorizer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.HashingTF
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IDF
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IDFModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IndexToString
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Interaction
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MinMaxScaler
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.OneHotEncoder
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.PCA
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.PCAModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.RFormula
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.RFormulaModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.SQLTransformer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StandardScaler
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StandardScalerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StopWordsRemover
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StringIndexer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StringIndexerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorAssembler
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorIndexer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorSlicer
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Word2Vec
transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Word2VecModel
transformSchema(StructType) - Method in class org.apache.spark.ml.Pipeline
transformSchema(StructType) - Method in class org.apache.spark.ml.PipelineModel
transformSchema(StructType) - Method in class org.apache.spark.ml.PipelineStage: :: DeveloperApi ::
transformSchema(StructType, boolean) - Method in class org.apache.spark.ml.PipelineStage: :: DeveloperApi ::
transformSchema(StructType) - Method in class org.apache.spark.ml.PredictionModel
transformSchema(StructType) - Method in class org.apache.spark.ml.Predictor
transformSchema(StructType) - Method in class org.apache.spark.ml.recommendation.ALS
transformSchema(StructType) - Method in class org.apache.spark.ml.recommendation.ALSModel
transformSchema(StructType) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
transformSchema(StructType) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
transformSchema(StructType) - Method in class org.apache.spark.ml.regression.IsotonicRegression
transformSchema(StructType) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.CrossValidator
transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
transformSchema(StructType) - Method in class org.apache.spark.ml.UnaryTransformer
transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream.
transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams.
transformWith(JavaDStream, Function3<R, JavaRDD, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream, Function2<RDD<T>, RDD, RDD<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWith(DStream, Function3<RDD<T>, RDD, Time, RDD<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaDStream, Function3<R, JavaRDD, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike: Return a new DStream in which each RDD is generated by applying a function on each RDD of 'this' DStream and 'other' DStream.
translate(Column, String, String) - Static method in class org.apache.spark.sql.functions: Translates any character in src that matches a character in matchingString to the corresponding character in replaceString.
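The character-by-character behavior of translate can be sketched on a plain string. The sketch assumes that a matched character with no counterpart in the replacement string is dropped, and that the first occurrence of a character in the matching string decides its replacement. Both points are assumptions to verify against the Spark documentation. `TranslateSketch` is a hypothetical name.

```java
/** Illustrative character translation on a plain String; not Spark's implementation. */
final class TranslateSketch {
    static String translate(String src, String matching, String replace) {
        StringBuilder out = new StringBuilder(src.length());
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            int pos = matching.indexOf(c);
            if (pos < 0) {
                out.append(c); // not listed in matching: keep as-is
            } else if (pos < replace.length()) {
                out.append(replace.charAt(pos)); // replace with the counterpart
            }
            // Otherwise: matched but no counterpart, so the character is dropped.
        }
        return out.toString();
    }
}
```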
transpose() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
transpose() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Transpose this BlockMatrix.
transpose() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
transpose() - Method in interface org.apache.spark.mllib.linalg.Matrix: Transpose the Matrix.
transpose() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Aggregates the elements of this RDD in a multi-level tree pattern.
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike: JavaRDDLike.treeAggregate(U, org.apache.spark.api.java.function.Function2<U, T, U>, org.apache.spark.api.java.function.Function2<U, U, U>, int) with suggested depth 2.
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag) - Method in class org.apache.spark.mllib.rdd.RDDFunctions: Deprecated. Use RDD.treeAggregate(U, scala.Function2<U, T, U>, scala.Function2<U, U, U>, int, scala.reflect.ClassTag) instead.
treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag) - Method in class org.apache.spark.rdd.RDD: Aggregates the elements of this RDD in a multi-level tree pattern.
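The multi-level tree pattern named in the treeAggregate entries above can be pictured with in-memory lists standing in for partitions. Each partition is folded on its own, then the partial results are merged level by level instead of all at once. This is a sketch only. The `fanIn` parameter and the class `TreeAggregateSketch` are hypothetical, and the way Spark derives the number of levels from its depth argument is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

/** Illustrative tree-style aggregation over in-memory "partitions". */
final class TreeAggregateSketch {
    static <T, U> U treeAggregate(List<List<T>> partitions,
                                  U zero,
                                  BiFunction<U, T, U> seqOp,
                                  BinaryOperator<U> combOp,
                                  int fanIn) {
        if (fanIn < 2) {
            throw new IllegalArgumentException("fanIn must be at least 2");
        }
        // Level 0: fold each partition independently, starting from zero.
        // The zero value is shared, so it should be immutable.
        List<U> partials = new ArrayList<>();
        for (List<T> partition : partitions) {
            U acc = zero;
            for (T element : partition) {
                acc = seqOp.apply(acc, element);
            }
            partials.add(acc);
        }
        // Upper levels: merge groups of fanIn adjacent partials until one remains.
        while (partials.size() > 1) {
            List<U> next = new ArrayList<>();
            for (int i = 0; i < partials.size(); i += fanIn) {
                U merged = partials.get(i);
                int end = Math.min(i + fanIn, partials.size());
                for (int j = i + 1; j < end; j++) {
                    merged = combOp.apply(merged, partials.get(j));
                }
                next.add(merged);
            }
            partials = next;
        }
        return partials.isEmpty() ? zero : partials.get(0);
    }
}
```

Merging in levels keeps any single merge step from having to combine every partial result, which is the motivation for a tree pattern over a flat one.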
treeReduce(Function2<T, T, T>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike: Reduces the elements of this RDD in a multi-level tree pattern.
treeReduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike: JavaRDDLike.treeReduce(org.apache.spark.api.java.function.Function2<T, T, T>, int) with suggested depth 2.
treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions: Deprecated. Use RDD.treeReduce(scala.Function2<T, T, T>, int) instead.
treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.rdd.RDD: Reduces the elements of this RDD in a multi-level tree pattern.
trees() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
trees() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
trees() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
trees() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
trees() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
trees() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
treeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
treeString() - Method in class org.apache.spark.sql.types.StructType
treeWeights() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
treeWeights() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
treeWeights() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
treeWeights() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
treeWeights() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
triangleCount() - Method in class org.apache.spark.graphx.GraphOps: Compute the number of triangles passing through each vertex.
TriangleCount - Class in org.apache.spark.graphx.lib: Compute the number of triangles passing through each vertex.
TriangleCount() - Constructor for class org.apache.spark.graphx.lib.TriangleCount
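The quantity computed by the triangleCount entries above, the number of triangles passing through each vertex, can be illustrated on a small in-memory graph. The sketch below treats edges as undirected, ignores self-loops and duplicate edges, and counts connected pairs among each vertex's neighbors. It is a single-machine illustration under those assumptions, not the GraphX algorithm. `TriangleCountSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative per-vertex triangle counting on an undirected edge list. */
final class TriangleCountSketch {
    static Map<Integer, Integer> triangleCount(int[][] edges) {
        // Build undirected adjacency sets; sets also remove duplicate edges.
        Map<Integer, Set<Integer>> adj = new HashMap<>();
        for (int[] e : edges) {
            if (e[0] == e[1]) {
                continue; // ignore self-loops
            }
            adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        // A triangle through v corresponds to a connected pair of v's neighbors.
        Map<Integer, Integer> counts = new HashMap<>();
        for (Map.Entry<Integer, Set<Integer>> entry : adj.entrySet()) {
            List<Integer> nbrs = new ArrayList<>(entry.getValue());
            int triangles = 0;
            for (int i = 0; i < nbrs.size(); i++) {
                for (int j = i + 1; j < nbrs.size(); j++) {
                    if (adj.get(nbrs.get(i)).contains(nbrs.get(j))) {
                        triangles++;
                    }
                }
            }
            counts.put(entry.getKey(), triangles);
        }
        return counts;
    }
}
```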
trim(Column) - Static method in class org.apache.spark.sql.functions: Trim the spaces from both ends for the specified string column.
TripletFields - Class in org.apache.spark.graphx: Represents a subset of the fields of an EdgeTriplet or EdgeContext.
TripletFields() - Constructor for class org.apache.spark.graphx.TripletFields: Constructs a default TripletFields in which all fields are included.
TripletFields(boolean, boolean, boolean) - Constructor for class org.apache.spark.graphx.TripletFields
triplets() - Method in class org.apache.spark.graphx.Graph: An RDD containing the edge triplets, which are edges along with the vertex data associated with the adjacent vertices.
triplets() - Method in class org.apache.spark.graphx.impl.GraphImpl: Return an RDD that brings edges together with their source and destination vertices.
truePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns true positive rate for a given label (category).
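For a single label, the true positive rate is the fraction of examples with that label that were also predicted as that label. A minimal sketch over parallel arrays follows. The class `TruePositiveRateSketch` is hypothetical, and returning 0.0 when the label never occurs is an assumption of this sketch.

```java
/** Illustrative per-label true positive rate: TP / (TP + FN). */
final class TruePositiveRateSketch {
    static double truePositiveRate(double[] predictions, double[] labels, double label) {
        int truePositives = 0;
        int actualPositives = 0; // TP + FN
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] == label) {
                actualPositives++;
                if (predictions[i] == label) {
                    truePositives++;
                }
            }
        }
        // Assumption: report 0.0 when the label does not occur at all.
        return actualPositives == 0 ? 0.0 : (double) truePositives / actualPositives;
    }
}
```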
trunc(Column, String) - Static method in class org.apache.spark.sql.functions: Returns date truncated to the unit specified by the format.
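Date truncation to a unit can be illustrated with `java.time`. The sketch assumes the unit strings "year", "yyyy", "yy", "month", "mon" and "mm", and throws on anything else. The accepted units and the behavior for an unsupported unit should be checked against the Spark documentation. `TruncSketch` is a hypothetical name.

```java
import java.time.LocalDate;
import java.util.Locale;

/** Illustrative date truncation on LocalDate; not Spark's implementation. */
final class TruncSketch {
    static LocalDate trunc(LocalDate date, String format) {
        switch (format.toLowerCase(Locale.ROOT)) {
            case "year":
            case "yyyy":
            case "yy":
                return date.withDayOfYear(1); // first day of the year
            case "month":
            case "mon":
            case "mm":
                return date.withDayOfMonth(1); // first day of the month
            default:
                // Assumption of this sketch: reject unknown units loudly.
                throw new IllegalArgumentException("unsupported unit: " + format);
        }
    }
}
```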
tuple(Encoder<T1>, Encoder<T2>) - Static method in class org.apache.spark.sql.Encoders: An encoder for 2-ary tuples.
tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>) - Static method in class org.apache.spark.sql.Encoders: An encoder for 3-ary tuples.
tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>, Encoder<T4>) - Static method in class org.apache.spark.sql.Encoders: An encoder for 4-ary tuples.
tuple(Encoder<T1>, Encoder<T2>, Encoder<T3>, Encoder<T4>, Encoder<T5>) - Static method in class org.apache.spark.sql.Encoders: An encoder for 5-ary tuples.
tValues() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary: T-statistic of estimated coefficients and intercept.
TwitterUtils - Class in org.apache.spark.streaming.twitter
TwitterUtils() - Constructor for class org.apache.spark.streaming.twitter.TwitterUtils
TypedColumn<T,U> - Class in org.apache.spark.sql: A Column where an Encoder has been given for the expected input and return type.
TypedColumn(Expression, ExpressionEncoder) - Constructor for class org.apache.spark.sql.TypedColumn
typeName() - Method in class org.apache.spark.mllib.linalg.VectorUDT
typeName() - Method in class org.apache.spark.sql.types.DataType: Name of the type used in JSON serialization.
typeName() - Method in class org.apache.spark.sql.types.DecimalType

U

U() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
udf(Function0<RT>, TypeTags.TypeTag<RT>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 0 arguments.
udf(Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 1 argument.
udf(Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 2 arguments.
udf(Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 3 arguments.
udf(Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 4 arguments.
udf(Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 5 arguments.
udf(Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 6 arguments.
udf(Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 7 arguments.
udf(Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 8 arguments.
udf(Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 9 arguments.
udf(Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Static method in class org.apache.spark.sql.functions: Defines a user-defined function (UDF) of 10 arguments.
udf() - Method in class org.apache.spark.sql.SQLContext: A collection of methods for registering user-defined functions (UDF).
UDF1<T1,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 1 argument.
UDF10<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 10 arguments.
UDF11<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 11 arguments.
UDF12<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 12 arguments.
UDF13<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 13 arguments.
UDF14<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 14 arguments.
UDF15<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 15 arguments.
UDF16<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 16 arguments.
UDF17<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 17 arguments.
UDF18<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 18 arguments.
UDF19<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 19 arguments.
UDF2<T1,T2,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 2 arguments.
UDF20<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 20 arguments.
UDF21<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 21 arguments.
UDF22<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,T22,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 22 arguments.
UDF3<T1,T2,T3,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 3 arguments.
UDF4<T1,T2,T3,T4,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 4 arguments.
UDF5<T1,T2,T3,T4,T5,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 5 arguments.
UDF6<T1,T2,T3,T4,T5,T6,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 6 arguments.
UDF7<T1,T2,T3,T4,T5,T6,T7,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 7 arguments.
UDF8<T1,T2,T3,T4,T5,T6,T7,T8,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 8 arguments.
UDF9<T1,T2,T3,T4,T5,T6,T7,T8,T9,R> - Interface in org.apache.spark.sql.api.java: A Spark SQL UDF that has 9 arguments.
UDFRegistration - Class in org.apache.spark.sql: Functions for registering user-defined functions.
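The idea behind the UDF interfaces and UDFRegistration listed above, registering a function under a name and invoking it later by that name, can be sketched without Spark. Everything in the block is hypothetical: `Udf1` is a stand-in shaped like a one-argument function interface, and `UdfRegistrySketch` is a plain map-backed registry, not the Spark class.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative name-to-function registry; not Spark's UDFRegistration. */
final class UdfRegistrySketch {
    /** Hypothetical one-argument function shape that may throw. */
    @FunctionalInterface
    interface Udf1<T, R> {
        R call(T value) throws Exception;
    }

    private final Map<String, Udf1<Object, Object>> functions = new HashMap<>();

    /** Registers f under name, replacing any earlier function with that name. */
    @SuppressWarnings("unchecked")
    <T, R> void register(String name, Udf1<T, R> f) {
        // The cast is unchecked: the caller must pass arguments of type T.
        functions.put(name, arg -> f.call((T) arg));
    }

    /** Looks up the function by name and applies it to arg. */
    Object invoke(String name, Object arg) {
        Udf1<Object, Object> f = functions.get(name);
        if (f == null) {
            throw new IllegalArgumentException("undefined function: " + name);
        }
        try {
            return f.call(arg);
        } catch (Exception e) {
            throw new RuntimeException("function " + name + " failed", e);
        }
    }
}
```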
uid() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
uid() - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
uid() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
uid() - Method in class org.apache.spark.ml.classification.GBTClassifier
uid() - Method in class org.apache.spark.ml.classification.LogisticRegression
uid() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
uid() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
uid() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
uid() - Method in class org.apache.spark.ml.classification.NaiveBayes
uid() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
uid() - Method in class org.apache.spark.ml.classification.OneVsRest
uid() - Method in class org.apache.spark.ml.classification.OneVsRestModel
uid() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
uid() - Method in class org.apache.spark.ml.classification.RandomForestClassifier
uid() - Method in class org.apache.spark.ml.clustering.KMeans
uid() - Method in class org.apache.spark.ml.clustering.KMeansModel
uid() - Method in class org.apache.spark.ml.clustering.LDA
uid() - Method in class org.apache.spark.ml.clustering.LDAModel
uid() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
uid() - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
uid() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
uid() - Method in class org.apache.spark.ml.feature.Binarizer
uid() - Method in class org.apache.spark.ml.feature.Bucketizer
uid() - Method in class org.apache.spark.ml.feature.ChiSqSelector
uid() - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
uid() - Method in class org.apache.spark.ml.feature.ColumnPruner
uid() - Method in class org.apache.spark.ml.feature.CountVectorizer
uid() - Method in class org.apache.spark.ml.feature.CountVectorizerModel
uid() - Method in class org.apache.spark.ml.feature.DCT
uid() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
uid() - Method in class org.apache.spark.ml.feature.HashingTF
uid() - Method in class org.apache.spark.ml.feature.IDF
uid() - Method in class org.apache.spark.ml.feature.IDFModel
uid() - Method in class org.apache.spark.ml.feature.IndexToString
uid() - Method in class org.apache.spark.ml.feature.Interaction
uid() - Method in class org.apache.spark.ml.feature.MinMaxScaler
uid() - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
uid() - Method in class org.apache.spark.ml.feature.NGram
uid() - Method in class org.apache.spark.ml.feature.Normalizer
uid() - Method in class org.apache.spark.ml.feature.OneHotEncoder
uid() - Method in class org.apache.spark.ml.feature.PCA
uid() - Method in class org.apache.spark.ml.feature.PCAModel
uid() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
uid() - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
uid() - Method in class org.apache.spark.ml.feature.RegexTokenizer
uid() - Method in class org.apache.spark.ml.feature.RFormula
uid() - Method in class org.apache.spark.ml.feature.RFormulaModel
uid() - Method in class org.apache.spark.ml.feature.SQLTransformer
uid() - Method in class org.apache.spark.ml.feature.StandardScaler
uid() - Method in class org.apache.spark.ml.feature.StandardScalerModel
uid() - Method in class org.apache.spark.ml.feature.StopWordsRemover
uid() - Method in class org.apache.spark.ml.feature.StringIndexer
uid() - Method in class org.apache.spark.ml.feature.StringIndexerModel
uid() - Method in class org.apache.spark.ml.feature.Tokenizer
uid() - Method in class org.apache.spark.ml.feature.VectorAssembler
uid() - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
uid() - Method in class org.apache.spark.ml.feature.VectorIndexer
uid() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
uid() - Method in class org.apache.spark.ml.feature.VectorSlicer
uid() - Method in class org.apache.spark.ml.feature.Word2Vec
uid() - Method in class org.apache.spark.ml.feature.Word2VecModel
uid() - Method in class org.apache.spark.ml.Pipeline
uid() - Method in class org.apache.spark.ml.PipelineModel
uid() - Method in class org.apache.spark.ml.recommendation.ALS
uid() - Method in class org.apache.spark.ml.recommendation.ALSModel
uid() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
uid() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
uid() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
uid() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
uid() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
uid() - Method in class org.apache.spark.ml.regression.GBTRegressor
uid() - Method in class org.apache.spark.ml.regression.IsotonicRegression
uid() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
uid() - Method in class org.apache.spark.ml.regression.LinearRegression
uid() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
uid() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
uid() - Method in class org.apache.spark.ml.regression.RandomForestRegressor
uid() - Method in class org.apache.spark.ml.tuning.CrossValidator
uid() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
uid() - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
uid() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
uid() - Method in interface org.apache.spark.ml.util.Identifiable: An immutable unique ID for the object and its derivatives.
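One common way to produce an identifier of the kind described in the Identifiable entry above is a readable prefix joined to a random suffix. The sketch below uses the last 12 characters of a random UUID. That format is an assumption for illustration and `UidSketch` is a hypothetical name; the index does not specify how Spark generates its IDs.

```java
import java.util.UUID;

/** Illustrative unique-ID generation: prefix plus a random hexadecimal suffix. */
final class UidSketch {
    static String randomUid(String prefix) {
        String uuid = UUID.randomUUID().toString();
        // The final group of a UUID string is 12 hexadecimal characters.
        return prefix + "_" + uuid.substring(uuid.length() - 12);
    }
}
```

Storing the result in a final field at construction time keeps the ID immutable for the lifetime of the object.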
uiTab() - Method in class org.apache.spark.streaming.StreamingContext
unapply(EdgeContext<VD, ED, A>) - Static method in class org.apache.spark.graphx.EdgeContext: Extractor mainly used for Graph#aggregateMessages*.
unapply(DenseVector) - Static method in class org.apache.spark.mllib.linalg.DenseVector: Extracts the value array from a dense vector.
unapply(SparseVector) - Static method in class org.apache.spark.mllib.linalg.SparseVector
unapply(Column) - Static method in class org.apache.spark.sql.Column
unapply(DataType) - Static method in class org.apache.spark.sql.types.DecimalType
unapply(Expression) - Static method in class org.apache.spark.sql.types.DecimalType
unapply(Expression) - Static method in class org.apache.spark.sql.types.NumericType: Enables matching against NumericType for expressions:
unapply(Broker) - Static method in class org.apache.spark.streaming.kafka.Broker
UnaryTransformer<IN,OUT,T extends UnaryTransformer<IN,OUT,T>> - Class in org.apache.spark.ml: :: DeveloperApi :: Abstract class for transformers that take one input column, apply transformation, and output the result as a new column.
UnaryTransformer() - Constructor for class org.apache.spark.ml.UnaryTransformer
unbase64(Column) - Static method in class org.apache.spark.sql.functions: Decodes a BASE64 encoded string column and returns it as a binary column.
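Outside a column expression, the same decoding step can be tried on a single string with the JDK's `java.util.Base64`. This is a sketch of the operation on one value, not of the Spark function. `Unbase64Sketch` is a hypothetical name.

```java
import java.util.Base64;

/** Illustrative Base64 decoding of one string to raw bytes. */
final class Unbase64Sketch {
    static byte[] unbase64(String encoded) {
        // The basic decoder throws IllegalArgumentException on malformed input.
        return Base64.getDecoder().decode(encoded);
    }
}
```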
unbroadcast(long, boolean, boolean) - Method in interface org.apache.spark.broadcast.BroadcastFactory
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory: Remove all persisted state associated with the HTTP broadcast with the given ID.
unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory: Remove all persisted state associated with the torrent broadcast with the given ID.
uncacheTable(String) - Method in class org.apache.spark.sql.SQLContext: Removes the specified table from the in-memory cache.
underlyingSplit() - Method in class org.apache.spark.scheduler.SplitInfo
unhandledFilters(Filter[]) - Method in class org.apache.spark.sql.sources.BaseRelation: Returns the list of Filters that this datasource may not be able to handle.
unhex(Column) - Static method in class org.apache.spark.sql.functions: Inverse of hex.
UniformGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the uniform distribution U(0.0, 1.0).
UniformGenerator() - Constructor for class org.apache.spark.mllib.random.UniformGenerator
uniformJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.uniformRDD(org.apache.spark.SparkContext, long, int, long).
uniformJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default seed.
uniformJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaRDD(org.apache.spark.api.java.JavaSparkContext, long, int, long) with the default number of partitions and the default seed.
uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Java-friendly version of RandomRDDs.uniformVectorRDD(org.apache.spark.SparkContext, long, int, int, long).
uniformJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, long, int, int, long) with the default seed.
uniformJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs: RandomRDDs.uniformJavaVectorRDD(org.apache.spark.api.java.JavaSparkContext, long, int, int, long) with the default number of partitions and the default seed.
uniformRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0).
uniformVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs: Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution U(0.0, 1.0).
union(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Return the union of this RDD and another one.
union(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD: Return the union of this RDD and another one.
union(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD: Return the union of this RDD and another one.
union(JavaRDD<T>, List<JavaRDD<T>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(JavaPairRDD<K, V>, List<JavaPairRDD<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(JavaDoubleRDD, List<JavaDoubleRDD>) - Method in class org.apache.spark.api.java.JavaSparkContext: Build the union of two or more RDDs.
union(RDD<T>) - Method in class org.apache.spark.rdd.RDD: Return the union of this RDD and another one.
union(Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Build the union of a list of RDDs.
union(RDD<T>, Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext: Build the union of a list of RDDs passed as variable-length arguments.
union(Dataset<T>) - Method in class org.apache.spark.sql.Dataset: Returns a new Dataset that contains the elements of both this and the other Dataset combined.
union(JavaDStream<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream by unifying data of another DStream with this DStream.
union(JavaPairDStream<K, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream by unifying data of another DStream with this DStream.
union(JavaDStream<T>, List<JavaDStream<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(JavaPairDStream<K, V>, List<JavaPairDStream<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
union(DStream<T>) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream by unifying data of another DStream with this DStream.
union(Seq<DStream<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext: Create a unified DStream from multiple DStreams of the same type and same slide duration.
unionAll(DataFrame) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame containing the union of rows in this frame and another frame.
UnionRDD<T> - Class in org.apache.spark.rdd
UnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionRDD
uniqueId() - Method in class org.apache.spark.storage.StreamBlockId
unix_timestamp() - Static method in class org.apache.spark.sql.functions: Gets current Unix timestamp in seconds.
unix_timestamp(Column) - Static method in class org.apache.spark.sql.functions: Converts a time string in the format yyyy-MM-dd HH:mm:ss to a Unix timestamp (in seconds), using the default timezone and the default locale; returns null on failure.
unix_timestamp(Column, String) - Static method in class org.apache.spark.sql.functions: Converts a time string with the given pattern (see http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html) to a Unix timestamp (in seconds); returns null on failure.
UnknownReason - Class in org.apache.spark: :: DeveloperApi :: We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result.
UnknownReason() - Constructor for class org.apache.spark.UnknownReason
Unlimited() - Static method in class org.apache.spark.sql.types.DecimalType
unpersist() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaPairRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.api.java.JavaRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.api.java.JavaRDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.broadcast.Broadcast: Asynchronously delete cached copies of this broadcast on the executors.
unpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast: Delete cached copies of this broadcast on the executors.
unpersist(boolean) - Method in class org.apache.spark.graphx.Graph: Uncaches both vertices and edges of this graph.
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
unpersist(boolean) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
unpersist() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics: Unpersist intermediate RDDs used in the computation.
unpersist(boolean) - Method in class org.apache.spark.rdd.RDD: Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.sql.DataFrame: Mark the DataFrame as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.sql.DataFrame: Mark the DataFrame as non-persistent, and remove all blocks for it from memory and disk.
unpersist(boolean) - Method in class org.apache.spark.sql.Dataset: Mark the Dataset as non-persistent, and remove all blocks for it from memory and disk.
unpersist() - Method in class org.apache.spark.sql.Dataset: Mark the Dataset as non-persistent, and remove all blocks for it from memory and disk.
unpersistVertices(boolean) - Method in class org.apache.spark.graphx.Graph: Uncaches only the vertices of this graph, leaving the edges alone.
unpersistVertices(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
unregister(QueryExecutionListener) - Method in class org.apache.spark.sql.util.ExecutionListenerManager: Unregisters the specified QueryExecutionListener.
unregisterDialect(JdbcDialect) - Static method in class org.apache.spark.sql.jdbc.JdbcDialects: Unregister a dialect.
Unresolved() - Static method in class org.apache.spark.ml.attribute.AttributeType: Unresolved type.
UnresolvedAttribute - Class in org.apache.spark.ml.attribute: :: DeveloperApi :: An unresolved attribute.
UnresolvedAttribute() - Constructor for class org.apache.spark.ml.attribute.UnresolvedAttribute
unresolvedTEncoder() - Method in class org.apache.spark.sql.Dataset: An unresolved version of the internal encoder for the type of this Dataset.
unset() - Static method in class org.apache.spark.TaskContext: Unset the thread-local TaskContext.
until(Time, Duration) - Method in class org.apache.spark.streaming.Time
untilOffset() - Method in class org.apache.spark.streaming.kafka.OffsetRange
update(RDD<Vector>, double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel: Perform a k-means update on a batch of data.
update(int, int, double) - Method in interface org.apache.spark.mllib.linalg.Matrix: Update element at (i, j)
update(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix: Update all the values of this matrix using the function f.
update() - Method in class org.apache.spark.scheduler.AccumulableInfo
update(int, Object) - Method in class org.apache.spark.sql.expressions.MutableAggregationBuffer: Update the ith value of this buffer.
update(MutableAggregationBuffer, Row) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction: Updates the given aggregation buffer with new input data from the input row.
update() - Method in class org.apache.spark.status.api.v1.AccumulableInfo
update(S) - Method in class org.apache.spark.streaming.State: Update the state with a new value.
update(T1, T2) - Method in class org.apache.spark.util.MutablePair: Updates this pair with new values and returns itself
updateAggregateMetrics(UIData.StageUIData, String, TaskMetrics, Option<TaskMetrics>) - Method in class org.apache.spark.ui.jobs.JobProgressListener: Upon receiving new metrics for a task, updates the per-stage and per-executor-per-stage aggregate metrics by calculating deltas between the currently recorded metrics and the new metrics.
updatePredictionError(RDD<LabeledPoint>, RDD<Tuple2<Object, Object>>, double, DecisionTreeModel, Loss) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel: :: DeveloperApi :: Update a zipped predictionError RDD (as obtained with computeInitialPredictionAndError)
Updater - Class in org.apache.spark.mllib.optimization: :: DeveloperApi :: Class used to perform steps (weight update) using Gradient Descent methods.
Updater() - Constructor for class org.apache.spark.mllib.optimization.Updater
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner, JavaPairRDD<K, S>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, int, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, RDD<Tuple2<K, S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, RDD<Tuple2<K, S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions: Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of each key.
upper(Column) - Static method in class org.apache.spark.sql.functions: Converts a string column to upper case.
useDisk() - Method in class org.apache.spark.storage.StorageLevel
useDst - Variable in class org.apache.spark.graphx.TripletFields: Indicates whether the destination vertex attribute is included.
useEdge - Variable in class org.apache.spark.graphx.TripletFields: Indicates whether the edge attribute is included.
useMemory() - Method in class org.apache.spark.storage.StorageLevel
useNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
useOffHeap() - Method in class org.apache.spark.storage.StorageLevel
user() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
user() - Method in class org.apache.spark.mllib.recommendation.Rating
user() - Method in class org.apache.spark.scheduler.JobLogger
USER_DEFAULT() - Static method in class org.apache.spark.sql.types.DecimalType
userClass() - Method in class org.apache.spark.mllib.linalg.VectorUDT
userClass() - Method in class org.apache.spark.sql.types.UserDefinedType: Class object for the UserType
UserDefinedAggregateFunction - Class in org.apache.spark.sql.expressions: :: Experimental :: The base class for implementing user-defined aggregate functions (UDAF).
UserDefinedAggregateFunction() - Constructor for class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
UserDefinedFunction - Class in org.apache.spark.sql: A user-defined function.
UserDefinedFunction(Object, DataType, Seq<DataType>) - Constructor for class org.apache.spark.sql.UserDefinedFunction
userDefinedPartitionColumns() - Method in class org.apache.spark.sql.sources.HadoopFsRelation: Optional user defined partition columns.
UserDefinedType<UserType> - Class in org.apache.spark.sql.types: :: DeveloperApi :: The data type for User Defined Types (UDTs).
UserDefinedType() - Constructor for class org.apache.spark.sql.types.UserDefinedType
userFactors() - Method in class org.apache.spark.ml.recommendation.ALSModel
userFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
useSrc - Variable in class org.apache.spark.graphx.TripletFields: Indicates whether the source vertex attribute is included.

V

V() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
validate() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix: Validates the block matrix info against the matrix data (blocks) and throws an exception if any error is found.
validateData() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
validateInputType(DataType) - Method in class org.apache.spark.ml.feature.DCT
validateInputType(DataType) - Method in class org.apache.spark.ml.feature.NGram
validateInputType(DataType) - Method in class org.apache.spark.ml.feature.RegexTokenizer
validateInputType(DataType) - Method in class org.apache.spark.ml.feature.Tokenizer
validateInputType(DataType) - Method in class org.apache.spark.ml.UnaryTransformer: Validates the input type.
validateParams() - Method in class org.apache.spark.ml.feature.Interaction
validateParams() - Method in class org.apache.spark.ml.feature.VectorSlicer
validateParams() - Method in interface org.apache.spark.ml.param.Params
validateParams() - Method in class org.apache.spark.ml.Pipeline
validateParams() - Method in class org.apache.spark.ml.PipelineModel
validateParams() - Method in class org.apache.spark.ml.tuning.CrossValidator
validateParams() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
validateParams() - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
validateParams() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
validationMetrics() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
validationTol() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
validators() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
validators() - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
validators() - Method in class org.apache.spark.mllib.classification.SVMWithSGD
validators() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
value() - Method in class org.apache.spark.Accumulable: Access the accumulator's current value; only allowed on master.
value() - Method in class org.apache.spark.broadcast.Broadcast: Get the broadcasted value.
value() - Method in class org.apache.spark.ComplexFutureAction
value() - Method in interface org.apache.spark.FutureAction: The value of this Future.
value() - Method in class org.apache.spark.ml.param.ParamPair
value() - Method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
value() - Method in class org.apache.spark.mllib.stat.test.BinarySample
value() - Method in class org.apache.spark.scheduler.AccumulableInfo
value() - Method in class org.apache.spark.SerializableWritable
value() - Method in class org.apache.spark.SimpleFutureAction
value() - Method in class org.apache.spark.sql.sources.EqualNullSafe
value() - Method in class org.apache.spark.sql.sources.EqualTo
value() - Method in class org.apache.spark.sql.sources.GreaterThan
value() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
value() - Method in class org.apache.spark.sql.sources.LessThan
value() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
value() - Method in class org.apache.spark.sql.sources.StringContains
value() - Method in class org.apache.spark.sql.sources.StringEndsWith
value() - Method in class org.apache.spark.sql.sources.StringStartsWith
value() - Method in class org.apache.spark.status.api.v1.AccumulableInfo
value() - Method in class org.apache.spark.storage.MemoryEntry
valueClassName() - Method in class org.apache.spark.ShuffleDependency
valueContainsNull() - Method in class org.apache.spark.sql.types.MapType
valueOf(String) - Static method in enum org.apache.spark.graphx.impl.EdgeActiveness: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.JobExecutionStatus: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.launcher.SparkAppHandle.State: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.sql.SaveMode: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.status.api.v1.ApplicationStatus: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.status.api.v1.StageStatus: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.status.api.v1.TaskSorting: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.spark.streaming.StreamingContextState: Returns the enum constant of this type with the specified name.
values() - Method in class org.apache.spark.api.java.JavaPairRDD: Return an RDD with the values of each tuple.
values() - Static method in enum org.apache.spark.graphx.impl.EdgeActiveness: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.JobExecutionStatus: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.launcher.SparkAppHandle.State: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
values() - Method in class org.apache.spark.ml.attribute.NominalAttribute
values() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
values() - Method in class org.apache.spark.mllib.linalg.DenseVector
values() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
values() - Method in class org.apache.spark.mllib.linalg.SparseVector
values() - Method in class org.apache.spark.rdd.PairRDDFunctions: Return an RDD with the values of each tuple.
values() - Static method in enum org.apache.spark.sql.SaveMode: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in class org.apache.spark.sql.sources.In
values() - Static method in enum org.apache.spark.status.api.v1.ApplicationStatus: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.status.api.v1.StageStatus: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.status.api.v1.TaskSorting: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.spark.streaming.StreamingContextState: Returns an array containing the constants of this enum type, in the order they are declared.
valueType() - Method in class org.apache.spark.sql.types.MapType
var_pop(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the population variance of the values in a group.
var_pop(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the population variance of the values in a group.
var_samp(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the unbiased variance of the values in a group.
var_samp(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: returns the unbiased variance of the values in a group.
variance() - Method in class org.apache.spark.api.java.JavaDoubleRDD: Compute the variance of this RDD's elements.
variance() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer: Unbiased estimate of sample variance of each dimension.
variance() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary: Sample variance vector.
Variance - Class in org.apache.spark.mllib.tree.impurity: :: Experimental :: Class for calculating variance during regression
Variance() - Constructor for class org.apache.spark.mllib.tree.impurity.Variance
variance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions: Compute the variance of this RDD's elements.
variance(Column) - Static method in class org.apache.spark.sql.functions: Aggregate function: alias for var_samp.
variance(String) - Static method in class org.apache.spark.sql.functions: Aggregate function: alias for var_samp.
variance() - Method in class org.apache.spark.util.StatCounter: Return the variance of the values.
vClassTag() - Method in class org.apache.spark.api.java.JavaHadoopRDD
vClassTag() - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
vClassTag() - Method in class org.apache.spark.api.java.JavaPairRDD
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
vClassTag() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
vdTag() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
vdTag() - Method in class org.apache.spark.graphx.VertexRDD
vector() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
Vector - Interface in org.apache.spark.mllib.linalg: Represents a numeric vector, whose index type is Int and value type is Double.
Vector - Class in org.apache.spark.util
Vector(double[]) - Constructor for class org.apache.spark.util.Vector
Vector.Multiplier - Class in org.apache.spark.util
Vector.Multiplier(double) - Constructor for class org.apache.spark.util.Vector.Multiplier
Vector.VectorAccumParam$ - Class in org.apache.spark.util
Vector.VectorAccumParam$() - Constructor for class org.apache.spark.util.Vector.VectorAccumParam$
VectorAssembler - Class in org.apache.spark.ml.feature: :: Experimental :: A feature transformer that merges multiple columns into a vector column.
VectorAssembler(String) - Constructor for class org.apache.spark.ml.feature.VectorAssembler
VectorAssembler() - Constructor for class org.apache.spark.ml.feature.VectorAssembler
VectorAttributeRewriter - Class in org.apache.spark.ml.feature: Utility transformer that rewrites Vector attribute names via prefix replacement.
VectorAttributeRewriter(String, Map<String, String>) - Constructor for class org.apache.spark.ml.feature.VectorAttributeRewriter
VectorIndexer - Class in org.apache.spark.ml.feature: :: Experimental :: Class for indexing categorical feature columns in a dataset of Vector.
VectorIndexer(String) - Constructor for class org.apache.spark.ml.feature.VectorIndexer
VectorIndexer() - Constructor for class org.apache.spark.ml.feature.VectorIndexer
VectorIndexerModel - Class in org.apache.spark.ml.feature: :: Experimental :: Transform categorical features to use 0-based indices instead of their original values.
Vectors - Class in org.apache.spark.mllib.linalg
Vectors() - Constructor for class org.apache.spark.mllib.linalg.Vectors
VectorSlicer - Class in org.apache.spark.ml.feature: :: Experimental :: This class takes a feature vector and outputs a new feature vector with a subarray of the original features.
VectorSlicer(String) - Constructor for class org.apache.spark.ml.feature.VectorSlicer
VectorSlicer() - Constructor for class org.apache.spark.ml.feature.VectorSlicer
VectorTransformer - Interface in org.apache.spark.mllib.feature: :: DeveloperApi :: Trait for transformation of a vector
VectorUDT - Class in org.apache.spark.mllib.linalg: :: AlphaComponent ::
VectorUDT() - Constructor for class org.apache.spark.mllib.linalg.VectorUDT
version() - Method in class org.apache.spark.api.java.JavaSparkContext: The version of Spark on which this application is running.
version() - Method in class org.apache.spark.SparkContext: The version of Spark on which this application is running.
vertcat(Matrix[]) - Static method in class org.apache.spark.mllib.linalg.Matrices: Vertically concatenate a sequence of matrices.
vertexAttr(long) - Method in class org.apache.spark.graphx.EdgeTriplet: Get the vertex object for the given vertex in the edge.
VertexRDD<VD> - Class in org.apache.spark.graphx: Extends RDD[(VertexId, VD)] by ensuring that there is only one entry for each vertex and by pre-indexing the entries for fast, efficient joins.
VertexRDD(SparkContext, Seq<Dependency<?>>) - Constructor for class org.apache.spark.graphx.VertexRDD
VertexRDDImpl<VD> - Class in org.apache.spark.graphx.impl
vertices() - Method in class org.apache.spark.graphx.Graph: An RDD containing the vertices and their associated attributes.
vertices() - Method in class org.apache.spark.graphx.impl.GraphImpl
visit(int, int, String, String, String, String[]) - Method in class org.apache.spark.util.InnerClosureFinder
visitMethod(int, String, String, String, String[]) - Method in class org.apache.spark.util.InnerClosureFinder
visitMethod(int, String, String, String, String[]) - Method in class org.apache.spark.util.ReturnStatementFinder
vManifest() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
vocabSize() - Method in class org.apache.spark.ml.clustering.LDAModel
vocabSize() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
vocabSize() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
vocabSize() - Method in class org.apache.spark.mllib.clustering.LDAModel: Vocabulary size (number of terms or words in the vocabulary)
vocabSize() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
vocabulary() - Method in class org.apache.spark.ml.feature.CountVectorizerModel
VocabWord - Class in org.apache.spark.mllib.feature: Entry in vocabulary
VocabWord(String, int, int[], int[], int) - Constructor for class org.apache.spark.mllib.feature.VocabWord
VoidFunction<T> - Interface in org.apache.spark.api.java.function: A function with no return value.
VoidFunction2<T1,T2> - Interface in org.apache.spark.api.java.function: A two-argument function that takes arguments of type T1 and T2 with no return value.

W

w(boolean) - Method in class org.apache.spark.ml.param.BooleanParam: Creates a param pair with the given value (for Java).
w(List<Double>) - Method in class org.apache.spark.ml.param.DoubleArrayParam: Creates a param pair with a List of values (for Java and Python).
w(double) - Method in class org.apache.spark.ml.param.DoubleParam: Creates a param pair with the given value (for Java).
w(float) - Method in class org.apache.spark.ml.param.FloatParam: Creates a param pair with the given value (for Java).
w(List<Integer>) - Method in class org.apache.spark.ml.param.IntArrayParam: Creates a param pair with a List of values (for Java and Python).
w(int) - Method in class org.apache.spark.ml.param.IntParam: Creates a param pair with the given value (for Java).
w(long) - Method in class org.apache.spark.ml.param.LongParam: Creates a param pair with the given value (for Java).
w(T) - Method in class org.apache.spark.ml.param.Param: Creates a param pair with the given value (for Java).
w(List<String>) - Method in class org.apache.spark.ml.param.StringArrayParam: Creates a param pair with a List of values (for Java and Python).
waiter() - Method in class org.apache.spark.streaming.StreamingContext
weekofyear(Column) - Static method in class org.apache.spark.sql.functions: Extracts the week number as an integer from a given date/timestamp/string.
WeibullGenerator - Class in org.apache.spark.mllib.random: :: DeveloperApi :: Generates i.i.d. samples from the Weibull distribution.
WeibullGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.WeibullGenerator
weightedFalsePositiveRate() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted false positive rate
weightedFMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged f-measure
weightedFMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged f1-measure
weightedPrecision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged precision
weightedRecall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted averaged recall (equal to the weighted true positive rate)
weightedTruePositiveRate() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics: Returns weighted true positive rate (equal to the weighted averaged recall)
weights() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
weights() - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
weights() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
weights() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
weights() - Method in class org.apache.spark.mllib.classification.SVMModel
weights() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
weights() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
weights() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
weights() - Method in class org.apache.spark.mllib.regression.LassoModel
weights() - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
weights() - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
when(Column, Object) - Method in class org.apache.spark.sql.Column: Evaluates a list of conditions and returns one of multiple possible result expressions.
when(Column, Object) - Static method in class org.apache.spark.sql.functions: Evaluates a list of conditions and returns one of multiple possible result expressions.
where(Column) - Method in class org.apache.spark.sql.DataFrame: Filters rows using the given condition.
where(String) - Method in class org.apache.spark.sql.DataFrame: Filters rows using the given SQL expression.
wholeTextFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
wholeTextFiles(String, int) - Method in class org.apache.spark.SparkContext: Read a directory of text files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI.
Window - Class in org.apache.spark.sql.expressions: :: Experimental :: Utility functions for defining windows in DataFrames.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaDStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream: Return a new DStream which is computed based on windowed batches of this DStream.
window(Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
window(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream: Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
WindowSpec - Class in org.apache.spark.sql.expressions: :: Experimental :: A window specification that defines the partitioning, ordering, and frame boundaries.
withColumn(String, Column) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame by adding a column or replacing the existing column that has the same name.
withColumnRenamed(String, String) - Method in class org.apache.spark.sql.DataFrame: Returns a new DataFrame with a column renamed.
withEdges(EdgeRDD<?>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
withEdges(EdgeRDD<?>) - Method in class org.apache.spark.graphx.VertexRDD: Prepares this VertexRDD for efficient joins with the given EdgeRDD.
withIndex(int) - Method in class org.apache.spark.ml.attribute.Attribute: Copy with a new index.
withIndex(int) - Method in class org.apache.spark.ml.attribute.BinaryAttribute
withIndex(int) - Method in class org.apache.spark.ml.attribute.NominalAttribute
withIndex(int) - Method in class org.apache.spark.ml.attribute.NumericAttribute
withIndex(int) - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
withMax(double) - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy with a new max value.
withMean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
withMetadata(Metadata) - Method in class org.apache.spark.sql.types.MetadataBuilder: Include the content of an existing Metadata instance.
withMin(double) - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy with a new min value.
withName(String) - Method in class org.apache.spark.ml.attribute.Attribute: Copy with a new name.
withName(String) - Method in class org.apache.spark.ml.attribute.BinaryAttribute
withName(String) - Method in class org.apache.spark.ml.attribute.NominalAttribute
withName(String) - Method in class org.apache.spark.ml.attribute.NumericAttribute
withName(String) - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
withNumValues(int) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy with a new `numValues` and empty `values`.
withoutIndex() - Method in class org.apache.spark.ml.attribute.Attribute: Copy without the index.
withoutIndex() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
withoutIndex() - Method in class org.apache.spark.ml.attribute.NominalAttribute
withoutIndex() - Method in class org.apache.spark.ml.attribute.NumericAttribute
withoutIndex() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
withoutMax() - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy without the max value.
withoutMin() - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy without the min value.
withoutName() - Method in class org.apache.spark.ml.attribute.Attribute: Copy without the name.
withoutName() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
withoutName() - Method in class org.apache.spark.ml.attribute.NominalAttribute
withoutName() - Method in class org.apache.spark.ml.attribute.NumericAttribute
withoutName() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
withoutNumValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy without the `numValues`.
withoutSparsity() - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy without the sparsity.
withoutStd() - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy without the standard deviation.
withoutSummary() - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy without summary statistics.
withoutValues() - Method in class org.apache.spark.ml.attribute.BinaryAttribute: Copy without the values.
withoutValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy without the values.
withPosition(Option<Object>, Option<Object>) - Method in exception org.apache.spark.sql.AnalysisException
withSparsity(double) - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy with a new sparsity.
withStd(double) - Method in class org.apache.spark.ml.attribute.NumericAttribute: Copy with a new standard deviation.
withStd() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
withValues(String, String) - Method in class org.apache.spark.ml.attribute.BinaryAttribute: Copy with new values.
withValues(String, String...) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy with new values and empty `numValues`.
withValues(String[]) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy with new values and empty `numValues`.
withValues(String, Seq<String>) - Method in class org.apache.spark.ml.attribute.NominalAttribute: Copy with new values and empty `numValues`.
word() - Method in class org.apache.spark.mllib.feature.VocabWord
Word2Vec - Class in org.apache.spark.ml.feature: :: Experimental :: Word2Vec trains a model of Map(String, Vector), i.e. transforms a word into a code for further natural language processing or machine learning.
Word2Vec(String) - Constructor for class org.apache.spark.ml.feature.Word2Vec
Word2Vec() - Constructor for class org.apache.spark.ml.feature.Word2Vec
Word2Vec - Class in org.apache.spark.mllib.feature
Word2Vec() - Constructor for class org.apache.spark.mllib.feature.Word2Vec
Word2VecModel - Class in org.apache.spark.ml.feature: :: Experimental :: Model fitted by Word2Vec.
Word2VecModel - Class in org.apache.spark.mllib.feature
Word2VecModel(Map<String, float[]>) - Constructor for class org.apache.spark.mllib.feature.Word2VecModel
wordIndex() - Method in class org.apache.spark.mllib.feature.Word2VecModel
wordVectors() - Method in class org.apache.spark.mllib.feature.Word2VecModel
wrapperClass() - Static method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
wrapRDD(RDD<Double>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaPairRDD
wrapRDD(RDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
wrapRDD(RDD<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
wrapRDD(RDD<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
wrapRDD(RDD<T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
wrapRDD(RDD<Tuple2<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
writableWritableConverter() - Static method in class org.apache.spark.SparkContext
write(int) - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
write(byte[]) - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
write(byte[], int, int) - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
write() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel: Returns an MLWriter instance for this ML instance.
write() - Method in class org.apache.spark.ml.classification.NaiveBayesModel
write() - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
write() - Method in class org.apache.spark.ml.clustering.KMeansModel
write() - Method in class org.apache.spark.ml.clustering.LocalLDAModel
write() - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
write() - Method in class org.apache.spark.ml.feature.CountVectorizerModel
write() - Method in class org.apache.spark.ml.feature.IDFModel
write() - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
write() - Method in class org.apache.spark.ml.feature.PCAModel
write() - Method in class org.apache.spark.ml.feature.StandardScalerModel
write() - Method in class org.apache.spark.ml.feature.StringIndexerModel
write() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
write() - Method in class org.apache.spark.ml.feature.Word2VecModel
write() - Method in class org.apache.spark.ml.Pipeline
write() - Method in class org.apache.spark.ml.PipelineModel
write() - Method in class org.apache.spark.ml.recommendation.ALSModel
write() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
write() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
write() - Method in class org.apache.spark.ml.regression.LinearRegressionModel: Returns an MLWriter instance for this ML instance.
write() - Method in class org.apache.spark.ml.tuning.CrossValidator
write() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
write() - Method in interface org.apache.spark.ml.util.MLWritable: Returns an MLWriter instance for this ML instance.
write(Kryo, Output, Iterable<?>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
write() - Method in class org.apache.spark.sql.DataFrame: :: Experimental :: Interface for saving the content of the DataFrame to external storage.
write(Row) - Method in class org.apache.spark.sql.sources.OutputWriter: Persists a single row.
write(int) - Method in class org.apache.spark.storage.TimeTrackingOutputStream
write(byte[]) - Method in class org.apache.spark.storage.TimeTrackingOutputStream
write(byte[], int, int) - Method in class org.apache.spark.storage.TimeTrackingOutputStream
write(ByteBuffer, long) - Method in class org.apache.spark.streaming.util.WriteAheadLog: Write the record to the log and return a record handle, which contains all the information necessary to read back the written record.
WriteAheadLog - Class in org.apache.spark.streaming.util: :: DeveloperApi :: This abstract class represents a write-ahead log (also known as a journal) that Spark Streaming uses to save the data received by receivers, and the associated metadata, to reliable storage so that it can be recovered after driver failures.
WriteAheadLog() - Constructor for class org.apache.spark.streaming.util.WriteAheadLog
WriteAheadLogRecordHandle - Class in org.apache.spark.streaming.util: :: DeveloperApi :: This abstract class represents a handle that refers to a record written in a WriteAheadLog.
WriteAheadLogRecordHandle() - Constructor for class org.apache.spark.streaming.util.WriteAheadLogRecordHandle
writeAll(Iterator<T>, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream
writeBytes() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetricDistributions
writeExternal(ObjectOutput) - Method in class org.apache.spark.serializer.JavaSerializer
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.BlockManagerId
writeExternal(ObjectOutput) - Method in class org.apache.spark.storage.StorageLevel
writeExternal(ObjectOutput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
writeInternal(InternalRow) - Method in class org.apache.spark.sql.sources.OutputWriter
writeKey(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream: Writes the object representing the key of a key-value pair.
writeObject(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream: The most general-purpose method to write an object.
writeRecords() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetricDistributions
writeTime() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetricDistributions
writeTime() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetrics
writeValue(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializationStream: Writes the object representing the value of a key-value pair.

Y

year(Column) - Static method in class org.apache.spark.sql.functions: Extracts the year as an integer from a given date/timestamp/string.

Z

zero() - Method in class org.apache.spark.Accumulable
zero(R) - Method in interface org.apache.spark.AccumulableParam: Return the "zero" (identity) value for an accumulator type, given its initial value.
zero(double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
zero(float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
zero(int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
zero(long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
zero(int, int) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
zero(double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
zero(float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
zero(int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
zero(long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
zero() - Method in class org.apache.spark.sql.expressions.Aggregator: A zero value for this aggregation.
ZERO() - Static method in class org.apache.spark.sql.types.Decimal
zero(Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
ZeroMQUtils - Class in org.apache.spark.streaming.zeromq
ZeroMQUtils() - Constructor for class org.apache.spark.streaming.zeromq.ZeroMQUtils
zeros(int, int) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix: Generate a DenseMatrix consisting of zeros.
zeros(int, int) - Static method in class org.apache.spark.mllib.linalg.Matrices: Generate a Matrix consisting of zeros.
zeros(int) - Static method in class org.apache.spark.mllib.linalg.Vectors: Creates a vector of all zeros.
zeros(int) - Static method in class org.apache.spark.util.Vector
zeroTime() - Method in class org.apache.spark.streaming.dstream.DStream
zip(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zip(RDD, ClassTag) - Method in class org.apache.spark.rdd.RDD: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc.
zipPartitions(JavaRDDLike<U, ?>, FlatMapFunction2<Iterator<T>, Iterator, V>) - Method in interface org.apache.spark.api.java.JavaRDDLike: Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD, boolean, Function2<Iterator<T>, Iterator, Iterator<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD: Zip this RDD's partitions with one (or more) RDD(s) and return a new RDD by applying a function to the zipped partitions.
zipPartitions(RDD, Function2<Iterator<T>, Iterator, Iterator<V>>, ClassTag, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, boolean, Function3<Iterator<T>, Iterator, Iterator<C>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, Function3<Iterator<T>, Iterator, Iterator<C>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, RDD<D>, boolean, Function4<Iterator<T>, Iterator, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipPartitions(RDD, RDD<C>, RDD<D>, Function4<Iterator<T>, Iterator, Iterator<C>, Iterator<D>, Iterator<V>>, ClassTag, ClassTag<C>, ClassTag<D>, ClassTag<V>) - Method in class org.apache.spark.rdd.RDD
zipWithIndex() - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with its element indices.
zipWithIndex() - Method in class org.apache.spark.rdd.RDD: Zips this RDD with its element indices.
zipWithUniqueId() - Method in interface org.apache.spark.api.java.JavaRDDLike: Zips this RDD with generated unique Long ids.
zipWithUniqueId() - Method in class org.apache.spark.rdd.RDD: Zips this RDD with generated unique Long ids.
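
The zipWithIndex and zipWithUniqueId entries above differ in how ids are assigned. The sketch below is plain Java, not Spark code: the class name ZipIdSketch and the nested lists standing in for partitions are hypothetical, and items are assumed distinct so they can serve as map keys. It illustrates the documented numbering, where zipWithIndex counts positions in partition order and zipWithUniqueId gives the items of the k-th of n partitions the ids k, n+k, 2n+k, and so on.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: each inner list plays the role of one RDD partition.
public class ZipIdSketch {

    // Index by global position: partition order first, then item order within a partition.
    static Map<String, Long> zipWithIndex(List<List<String>> partitions) {
        Map<String, Long> out = new LinkedHashMap<>();
        long next = 0;
        for (List<String> partition : partitions) {
            for (String item : partition) {
                out.put(item, next++);
            }
        }
        return out;
    }

    // Unique id without counting earlier partitions: the i-th item of
    // partition k gets i * n + k, where n is the number of partitions.
    static Map<String, Long> zipWithUniqueId(List<List<String>> partitions) {
        Map<String, Long> out = new LinkedHashMap<>();
        int n = partitions.size();
        for (int k = 0; k < n; k++) {
            List<String> partition = partitions.get(k);
            for (int i = 0; i < partition.size(); i++) {
                out.put(partition.get(i), (long) i * n + k);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("d"),
            Arrays.asList("e", "f"));
        System.out.println(zipWithIndex(partitions));    // {a=0, b=1, c=2, d=3, e=4, f=5}
        System.out.println(zipWithUniqueId(partitions)); // {a=0, b=3, c=6, d=1, e=2, f=5}
    }
}
```

In this sketch the unique ids are not contiguous (4 is never used), which reflects the trade-off: each partition can number its own items without knowing the sizes of the partitions before it.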

_

_1() - Method in class org.apache.spark.util.MutablePair
_2() - Method in class org.apache.spark.util.MutablePair
_rddInfoMap() - Method in class org.apache.spark.ui.storage.StorageListener
_sqlContext() - Method in class org.apache.spark.sql.SQLContext.implicits$
_sqlContext() - Method in class org.apache.spark.sql.SQLImplicits
