public class JavaSQLContext
extends Object
Constructor and Description |
---|
JavaSQLContext(JavaSparkContext sparkContext) |
JavaSQLContext(SQLContext sqlContext) |
Modifier and Type | Method and Description |
---|---|
JavaSchemaRDD | applySchema(JavaRDD<?> rdd, Class<?> beanClass) Applies a schema to an RDD of Java Beans. |
JavaSchemaRDD | createParquetFile(Class<?> beanClass, String path, boolean allowExisting, org.apache.hadoop.conf.Configuration conf) :: Experimental :: Creates an empty parquet file with the schema of class beanClass, which can be registered as a table. |
JavaSchemaRDD | jsonFile(String path) Loads a JSON file (one object per line), returning the result as a JavaSchemaRDD. |
JavaSchemaRDD | jsonRDD(JavaRDD<String> json) Loads an RDD[String] storing JSON objects (one object per record), returning the result as a JavaSchemaRDD. |
JavaSchemaRDD | parquetFile(String path) Loads a parquet file, returning the result as a JavaSchemaRDD. |
void | registerRDDAsTable(JavaSchemaRDD rdd, String tableName) Registers the given RDD as a temporary table in the catalog. |
JavaSchemaRDD | sql(String sqlQuery) Executes a query expressed in SQL, returning the result as a JavaSchemaRDD. |
SQLContext | sqlContext() |
public JavaSQLContext(SQLContext sqlContext)
public JavaSQLContext(JavaSparkContext sparkContext)
public SQLContext sqlContext()
public JavaSchemaRDD sql(String sqlQuery)
Executes a query expressed in SQL, returning the result as a JavaSchemaRDD.
public JavaSchemaRDD createParquetFile(Class<?> beanClass, String path, boolean allowExisting, org.apache.hadoop.conf.Configuration conf)
:: Experimental ::
Creates an empty parquet file with the schema of class beanClass, which can be registered as a table. This registered table can be used as the target of future insertInto operations.
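JavaSQLContext sqlCtx = new JavaSQLContext(...);
// allowExisting = true and a fresh Hadoop Configuration are illustrative choices for the two trailing arguments.
sqlCtx.createParquetFile(Person.class, "path/to/file.parquet", true, new org.apache.hadoop.conf.Configuration()).registerAsTable("people");
sqlCtx.sql("INSERT INTO people SELECT 'michael', 29");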
beanClass - A Java bean class object that will be used to determine the schema of the parquet file.
path - The path where the directory containing parquet metadata should be created. Data inserted into this table will also be stored at this location.
allowExisting - When false, an exception will be thrown if this directory already exists.
conf - A Hadoop configuration object that can be used to specify options to the parquet output format.
public JavaSchemaRDD applySchema(JavaRDD<?> rdd, Class<?> beanClass)
Applies a schema to an RDD of Java Beans.
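Both applySchema and createParquetFile take a Java bean class and derive a schema from it. The following is a minimal sketch of how bean properties can be discovered with the standard java.beans API. It is an illustration only and is not presented as Spark's actual implementation; the class BeanSchemaSketch, the method describe, and the Person bean are hypothetical names introduced for this example.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

final class BeanSchemaSketch {

    /** Hypothetical bean: each getter/setter pair defines one property. */
    public static class Person {
        private String name;
        private int age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    /** Returns property name -> simple type name for every readable bean property. */
    static Map<String, String> describe(Class<?> beanClass) throws IntrospectionException {
        // Stopping at Object.class keeps the inherited "class" property out of the result.
        BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
        Map<String, String> fields = new TreeMap<>(); // sorted for stable output
        for (PropertyDescriptor property : info.getPropertyDescriptors()) {
            if (property.getReadMethod() != null) {
                fields.put(property.getName(), property.getPropertyType().getSimpleName());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws IntrospectionException {
        System.out.println(describe(Person.class)); // prints {age=int, name=String}
    }
}
```

A class shaped like Person above (public, with a getter for each field) is the kind of bean these methods expect as beanClass.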
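public JavaSchemaRDD parquetFile(String path)
Loads a parquet file, returning the result as a JavaSchemaRDD.
public JavaSchemaRDD jsonFile(String path)
Loads a JSON file (one object per line), returning the result as a JavaSchemaRDD. It goes through the entire dataset once to determine the schema.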
public JavaSchemaRDD jsonRDD(JavaRDD<String> json)
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a JavaSchemaRDD. It goes through the entire dataset once to determine the schema.
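The note that jsonFile and jsonRDD go through the entire dataset once to determine the schema can be pictured with a small single-pass sketch. It assumes the JSON objects have already been parsed into Map<String, Object> records, and its widening rule (conflicting types fall back to String) is an assumption chosen for illustration, not a statement of Spark's inference rules. OnePassSchemaSketch, inferSchema, and typeName are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class OnePassSchemaSketch {

    /** Scans every record exactly once and returns field name -> inferred type name. */
    static Map<String, String> inferSchema(List<Map<String, Object>> records) {
        Map<String, String> schema = new TreeMap<>(); // sorted for stable output
        for (Map<String, Object> record : records) {
            for (Map.Entry<String, Object> field : record.entrySet()) {
                String seen = typeName(field.getValue());
                // Same type as before: keep it. Conflicting type: widen to String (assumed rule).
                schema.merge(field.getKey(), seen,
                        (previous, current) -> previous.equals(current) ? previous : "String");
            }
        }
        return schema;
    }

    /** Maps a parsed value to a coarse type name; anything unrecognized is treated as String. */
    private static String typeName(Object value) {
        if (value instanceof Integer || value instanceof Long) return "Long";
        if (value instanceof Number) return "Double";
        if (value instanceof Boolean) return "Boolean";
        return "String";
    }

    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("name", "michael");
        first.put("age", 29);
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("name", "andy");
        second.put("city", "Berkeley");
        // Fields missing from some records still appear in the merged schema.
        System.out.println(inferSchema(Arrays.asList(first, second)));
    }
}
```

Because the schema is only known after every record has been seen, the cost of the scan grows with the size of the input, which is the practical consequence of the one-pass note above.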
public void registerRDDAsTable(JavaSchemaRDD rdd, String tableName)
Registers the given RDD as a temporary table in the catalog.