SQL Syntax

Spark SQL is Apache Spark’s module for working with structured data. The SQL Syntax section describes the SQL syntax in detail along with usage examples when applicable. This document provides a list of Data Definition and Data Manipulation Statements, as well as Data Retrieval and Auxiliary Statements.

DDL Statements

Data Definition Statements are used to create or modify the structure of database objects in a database. Spark SQL supports the following Data Definition Statements:

DML Statements

Data Manipulation Statements are used to add, change, or delete data. Spark SQL supports the following Data Manipulation Statements:

Data Retrieval Statements

Spark supports SELECT statement that is used to retrieve rows from one or more tables according to the specified clauses. The full syntax and brief description of supported clauses are explained in SELECT section. The SQL statements related to SELECT are also included in this section. Spark also provides the ability to generate logical and physical plan for a given query using EXPLAIN statement.

Auxiliary Statements