Add a file to be downloaded with this Spark job on every node.
The path passed can be either a local file, a file in HDFS
(or other Hadoop-supported filesystems), or an HTTP, HTTPS or
To access the file in Spark jobs, use SparkFiles.get() with the
filename to find its download location.
A directory can be given if the recursive option is set to True.
Currently directories are only supported for Hadoop-supported filesystems.
A path can be added only once. Subsequent additions of the same path are ignored.
>>> from pyspark import SparkFiles
>>> path = os.path.join(tempdir, "test.txt")
>>> with open(path, "w") as testFile:
... _ = testFile.write("100")
>>> def func(iterator):
... with open(SparkFiles.get("test.txt")) as testFile:
... fileVal = int(testFile.readline())
... return [x * fileVal for x in iterator]
>>> sc.parallelize([1, 2, 3, 4]).mapPartitions(func).collect()
[100, 200, 300, 400]