Package org.apache.spark.sql.api.r

Class SQLUtils

java.lang.Object
    org.apache.spark.sql.api.r.SQLUtils
Constructor Summary

SQLUtils()

Method Summary
static ArrayType createArrayType(Column column)
static Dataset<Row> createDF(RDD<byte[]> rdd, StructType schema, SparkSession sparkSession)
static StructField createStructField(String name, String dataType, boolean nullable)
static StructType createStructType(scala.collection.Seq<StructField> fields)
static Dataset<Row> dapply(Dataset<Row> df, byte[] func, byte[] packageNames, Object[] broadcastVars, StructType schema)
    The helper function for dapply() on the R side.
static Object[][] dfToCols(Dataset<Row> df)
static JavaRDD<byte[]> dfToRowRDD(Dataset<Row> df)
static Dataset<Row> gapply(RelationalGroupedDataset gd, byte[] func, byte[] packageNames, Object[] broadcastVars, StructType schema)
    The helper function for gapply() on the R side.
static JavaSparkContext getJavaSparkContext(SparkSession spark)
static SparkSession getOrCreateSparkSession(JavaSparkContext jsc, Map<Object, Object> sparkConfigMap, boolean enableHiveSupport)
static Map<String,String> getSessionConf(SparkSession spark)
static String[] getTableNames(SparkSession sparkSession, String databaseName)
static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
static JavaRDD<byte[]> readArrowStreamFromFile(SparkSession sparkSession, String filename)
    R callable function to read a file in Arrow stream format and create an RDD using each serialized ArrowRecordBatch as a partition.
static Object readSqlObject(DataInputStream dis, char dataType)
static StructType SERIALIZED_R_DATA_SCHEMA()
static void setSparkContextSessionConf(SparkSession spark, Map<Object, Object> sparkConfigMap)
static Dataset<Row> toDataFrame(JavaRDD<byte[]> arrowBatchRDD, StructType schema, SparkSession sparkSession)
    R callable function to create a DataFrame from a JavaRDD of serialized ArrowRecordBatches.
static boolean writeSqlObject(DataOutputStream dos, Object obj)
Constructor Details

SQLUtils
public SQLUtils()
Method Details

getOrCreateSparkSession
public static SparkSession getOrCreateSparkSession(JavaSparkContext jsc, Map<Object, Object> sparkConfigMap, boolean enableHiveSupport)
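This entry point is aimed at SparkR's backend, but it is callable from the JVM. A minimal sketch of the call shape, assuming a local master and an illustrative config entry (SparkR normally supplies the JavaSparkContext and config map itself):

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.api.r.SQLUtils;

    public class GetOrCreateSketch {
        public static void main(String[] args) {
            // Assumed setup: SparkR would hand over its own JavaSparkContext here.
            JavaSparkContext jsc = new JavaSparkContext(
                new SparkConf().setMaster("local[2]").setAppName("sqlutils-sketch"));

            // Illustrative session override; an empty map is also acceptable.
            Map<Object, Object> conf = new HashMap<>();
            conf.put("spark.sql.shuffle.partitions", "4");

            // Returns the active SparkSession if one exists, otherwise creates one;
            // `false` skips Hive support.
            SparkSession spark = SQLUtils.getOrCreateSparkSession(jsc, conf, false);
            System.out.println(spark.version());
        }
    }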
setSparkContextSessionConf
public static void setSparkContextSessionConf(SparkSession spark, Map<Object, Object> sparkConfigMap)
getSessionConf
public static Map<String,String> getSessionConf(SparkSession spark)
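A follow-on sketch pairing the two conf helpers above; the key and value are illustrative, and it assumes the imports and the `spark` session from the previous sketch:

    // Illustrative override to push into the session.
    Map<Object, Object> overrides = new HashMap<>();
    overrides.put("spark.sql.session.timeZone", "UTC");

    // Pushes the map into the session/context configuration.
    SQLUtils.setSparkContextSessionConf(spark, overrides);

    // Reads the effective session configuration back as a java.util.Map.
    Map<String, String> effective = SQLUtils.getSessionConf(spark);
    System.out.println(effective.get("spark.sql.session.timeZone"));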
getJavaSparkContext
public static JavaSparkContext getJavaSparkContext(SparkSession spark)

createStructType
public static StructType createStructType(scala.collection.Seq<StructField> fields)

createStructField
public static StructField createStructField(String name, String dataType, boolean nullable)
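The two struct builders above pair naturally. A minimal sketch, assuming Scala 2.13's scala.jdk.javaapi.CollectionConverters for the Java-to-Scala Seq conversion (field names and type strings are illustrative):

    import java.util.Arrays;

    import scala.jdk.javaapi.CollectionConverters;

    import org.apache.spark.sql.types.StructField;
    import org.apache.spark.sql.types.StructType;

    // Each field is declared with a name, a data-type string, and nullability.
    StructField name = SQLUtils.createStructField("name", "string", true);
    StructField age  = SQLUtils.createStructField("age", "integer", true);

    // createStructType takes a scala.collection.Seq, so convert the Java list.
    StructType schema = SQLUtils.createStructType(
        CollectionConverters.asScala(Arrays.asList(name, age)).toSeq());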
createDF
public static Dataset<Row> createDF(RDD<byte[]> rdd, StructType schema, SparkSession sparkSession)

dfToRowRDD
public static JavaRDD<byte[]> dfToRowRDD(Dataset<Row> df)

SERIALIZED_R_DATA_SCHEMA
public static StructType SERIALIZED_R_DATA_SCHEMA()
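As its name suggests, this accessor exposes the schema SQLUtils uses for data kept in serialized R form; a one-line sketch to inspect it (assuming the imports above):

    // Prints the schema used for serialized R data.
    System.out.println(SQLUtils.SERIALIZED_R_DATA_SCHEMA().treeString());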
dapply
public static Dataset<Row> dapply(Dataset<Row> df, byte[] func, byte[] packageNames, Object[] broadcastVars, StructType schema)

The helper function for dapply() on the R side.

Parameters:
df - (undocumented)
func - (undocumented)
packageNames - (undocumented)
broadcastVars - (undocumented)
schema - (undocumented)
Returns:
(undocumented)
gapply
public static Dataset<Row> gapply(RelationalGroupedDataset gd, byte[] func, byte[] packageNames, Object[] broadcastVars, StructType schema)

The helper function for gapply() on the R side.

Parameters:
gd - (undocumented)
func - (undocumented)
packageNames - (undocumented)
broadcastVars - (undocumented)
schema - (undocumented)
Returns:
(undocumented)
dfToCols
public static Object[][] dfToCols(Dataset<Row> df)
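A small sketch; given the name and the Object[][] return type, the result is column-major, with one inner array per DataFrame column (assumes `spark` from the earlier sketches, plus imports for Dataset and Row):

    // A stand-in DataFrame with a single "id" column.
    Dataset<Row> df = spark.range(3).toDF();

    // One inner array per column; cols[0] holds every value of "id".
    Object[][] cols = SQLUtils.dfToCols(df);
    System.out.println(cols.length);     // 1
    System.out.println(cols[0].length);  // 3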
readSqlObject
public static Object readSqlObject(DataInputStream dis, char dataType)

writeSqlObject
public static boolean writeSqlObject(DataOutputStream dos, Object obj)
getTableNames
public static String[] getTableNames(SparkSession sparkSession, String databaseName)
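A one-liner sketch; "default" is the standard database name in the session catalog, and `spark` is assumed as in the earlier sketches:

    // Lists table names in the "default" database.
    for (String table : SQLUtils.getTableNames(spark, "default")) {
        System.out.println(table);
    }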
createArrayType
public static ArrayType createArrayType(Column column)
readArrowStreamFromFile
public static JavaRDD<byte[]> readArrowStreamFromFile(SparkSession sparkSession, String filename)

R callable function to read a file in Arrow stream format and create an RDD using each serialized ArrowRecordBatch as a partition.

Parameters:
sparkSession - (undocumented)
filename - (undocumented)
Returns:
(undocumented)
toDataFrame
public static Dataset<Row> toDataFrame(JavaRDD<byte[]> arrowBatchRDD, StructType schema, SparkSession sparkSession)

R callable function to create a DataFrame from a JavaRDD of serialized ArrowRecordBatches.

Parameters:
arrowBatchRDD - (undocumented)
schema - (undocumented)
sparkSession - (undocumented)
Returns:
(undocumented)
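A sketch pairing the two Arrow helpers; the path is hypothetical, and the file must already contain an Arrow IPC stream whose batches match the schema passed in (assumes `spark`, the converter imports from the struct-builder sketch, and an import for JavaRDD):

    // Hypothetical input; each partition of the result is one serialized
    // ArrowRecordBatch read from the stream-format file.
    JavaRDD<byte[]> batches =
        SQLUtils.readArrowStreamFromFile(spark, "/tmp/people.arrow");

    // The schema must describe the record batches in the file
    // (see the createStructField/createStructType sketch above).
    StructType schema = SQLUtils.createStructType(
        CollectionConverters.asScala(Arrays.asList(
            SQLUtils.createStructField("name", "string", true),
            SQLUtils.createStructField("age", "integer", true))).toSeq());

    Dataset<Row> df = SQLUtils.toDataFrame(batches, schema, spark);
    df.show();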
org$apache$spark$internal$Logging$$log_
public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()

org$apache$spark$internal$Logging$$log__$eq
public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)