Download and Install Apache Spark to a Local Directory
install.spark downloads and installs Spark to a local directory if
it is not found. If SPARK_HOME is set in the environment and that directory exists, that directory is
returned. The Spark version used is the same as the SparkR version. Users can specify a desired
Hadoop version, the remote mirror site, and the local directory where the package is installed.
hadoopVersion: Version of Hadoop to install. Default is
If hadoopVersion = "without", the "Hadoop free" build is installed. See "Hadoop Free" Build for more information. Other patched version names can also be used.
mirrorUrl: Base URL of the repositories to use. The directory layout should follow that of the Apache mirrors.
localDir: A local directory where Spark is installed. The directory contains version-specific folders of Spark packages. Default is the path to the cache directory:
Mac OS X:
$XDG_CACHE_HOME if defined, otherwise
overwrite: If TRUE, download and overwrite the existing tar file in localDir and force a re-install of Spark (in case the local directory or file is corrupted).
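The arguments above can be combined in a single call. The following is a minimal sketch, not a prescribed workflow: the wrapper name install_spark_example and its parameters local_dir and mirror_url are hypothetical, and passing NULL for mirrorUrl is assumed to defer to the package's default mirror.

```r
# Hypothetical wrapper around SparkR::install.spark(); nothing is downloaded
# until the wrapper is called.
install_spark_example <- function(local_dir, mirror_url = NULL) {
  if (!requireNamespace("SparkR", quietly = TRUE)) {
    stop("The SparkR package is required but is not installed.")
  }
  SparkR::install.spark(
    hadoopVersion = "without",  # request the "Hadoop free" build
    mirrorUrl = mirror_url,     # assumption: NULL falls back to the default mirror
    localDir = local_dir,       # version-specific Spark folders are placed here
    overwrite = FALSE           # keep an existing tar file if one is present
  )
}

# Example call (downloads Spark, so it is left commented out):
# install_spark_example(file.path(tempdir(), "spark-cache"))
```

Setting overwrite = TRUE in the call would instead force a fresh download, which is useful when a previous download was interrupted.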
The full URL of the remote file is inferred from mirrorUrl and hadoopVersion.
mirrorUrl specifies the remote path to a Spark folder. It is followed by a subfolder
named after the Spark version (which corresponds to the SparkR version), and then the tar filename.
The filename is composed of four parts, i.e. [Spark version]-bin-[Hadoop version].tgz.
For example, a Spark 3.3.1 package from
https://archive.apache.org has the full path:
For hadoopVersion = "without", [Hadoop version] in the filename is then
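The URL layout described above (mirror path, version subfolder, tar filename) can be illustrated with a small helper. This is only a sketch of the naming scheme, not the code SparkR uses internally: the function spark_tarball_url is hypothetical, the mirror https://mirror.example.org/spark is a placeholder, and the "spark-" prefix and the "hadoop3" token are assumed values for [Spark version] and [Hadoop version].

```r
# Hypothetical helper: compose <mirror>/<Spark folder>/<tar filename>.
spark_tarball_url <- function(mirror_url, spark_version, hadoop_token) {
  # Assumption: the version subfolder and filename both start with "spark-".
  package_name <- sprintf("spark-%s", spark_version)
  # Filename pattern from the text: [Spark version]-bin-[Hadoop version].tgz
  file_name <- sprintf("%s-bin-%s.tgz", package_name, hadoop_token)
  # Drop trailing slashes on the mirror so the parts join with single slashes.
  paste(sub("/+$", "", mirror_url), package_name, file_name, sep = "/")
}

print(spark_tarball_url("https://mirror.example.org/spark", "3.3.1", "hadoop3"))
# [1] "https://mirror.example.org/spark/spark-3.3.1/spark-3.3.1-bin-hadoop3.tgz"
```

Keeping the Hadoop token as a plain argument means the same helper covers patched version names without any special casing.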
See available Hadoop versions: Apache Spark