Spark Release 0.7.2

Spark 0.7.2 is a maintenance release that contains multiple bug fixes and improvements. You can download it as a source package (4 MB tar.gz) or as a prebuilt package for Hadoop 1 / CDH3 or for CDH4 (61 MB tar.gz).

We recommend that all users update to this maintenance release.

The fixes and improvements in this version include:

Scala version updated to 2.9.3.
Several improvements to Bagel, including performance fixes and a configurable storage level.
New API methods: subtractByKey, foldByKey, mapWith, filterWith, foreachPartition, and others.
A new metrics reporting interface, SparkListener, to collect information about each computation stage: task lengths, bytes shuffled, etc.
Several new examples using the Java API, including K-means and computing pi.
Support for launching multiple worker instances per host in standalone mode.
Various bug fixes across the board.
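
As a rough illustration of what two of the new pair operations compute, the sketch below models subtractByKey and foldByKey on ordinary Java collections. It does not use Spark at all: the class PairOpsSketch, the helper pair, and the sample data are hypothetical names invented for this example, and the real methods operate on distributed datasets rather than in-memory lists.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BinaryOperator;

// Hypothetical, Spark-free model of two pair operations named in these notes.
public class PairOpsSketch {

    // Small helper (an assumption of this sketch) to build a key-value pair.
    public static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleEntry<>(key, value);
    }

    // Models subtractByKey: keep pairs from `left` whose key never appears in `right`.
    public static <K, V, W> List<Map.Entry<K, V>> subtractByKey(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, W>> right) {
        Set<K> rightKeys = new HashSet<>();
        for (Map.Entry<K, W> e : right) {
            rightKeys.add(e.getKey());
        }
        List<Map.Entry<K, V>> result = new ArrayList<>();
        for (Map.Entry<K, V> e : left) {
            if (!rightKeys.contains(e.getKey())) {
                result.add(e);
            }
        }
        return result;
    }

    // Models foldByKey: merge the values of each key with `op`, starting from `zero`.
    // In a distributed setting the zero value may be applied more than once,
    // so it should be neutral for `op` (0 for addition, for example).
    public static <K, V> Map<K, V> foldByKey(
            List<Map.Entry<K, V>> pairs, V zero, BinaryOperator<V> op) {
        Map<K, V> acc = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : pairs) {
            V current = acc.containsKey(e.getKey()) ? acc.get(e.getKey()) : zero;
            acc.put(e.getKey(), op.apply(current, e.getValue()));
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> sales =
                Arrays.asList(pair("apples", 3), pair("pears", 5), pair("apples", 4));
        List<Map.Entry<String, Boolean>> discontinued =
                Arrays.asList(pair("pears", true));

        System.out.println(subtractByKey(sales, discontinued)); // [apples=3, apples=4]
        System.out.println(foldByKey(sales, 0, Integer::sum));  // {apples=7, pears=5}
    }
}
```

In this model, subtractByKey ignores the values on the right-hand side and looks only at its keys, while foldByKey behaves like a per-key reduction with an explicit starting value.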

The following people contributed to this release:

Jey Kottalam (Maven build, bug fixes, EC2 scripts, packaging the release)
Andrew Ash (bug fixes, docs)
Andrey Kouznetsov (bug fixes)
Andy Konwinski (docs)
Charles Reiss (bug fixes)
Christoph Grothaus (bug fixes)
Erik van Oosten (bug fixes)
Giovanni Delussu (bug fixes)
Hiral Patel (bug fixes)
Holden Karau (error reporting, EC2 scripts)
Imran Rashid (metrics reporting system)
Josh Rosen (EC2 scripts)
Mark Hamstra (new API methods, tests)
Mikhail Bautin (build)
Mosharaf Chowdhury (bug fixes)
Nick Pentreath (Bagel, examples)
Patrick Wendell (bug fixes)
Reynold Xin (bug fixes)
Stephen Haberman (bug fixes, tests, subtractByKey)
Kalpit Shah (build, multiple workers per host)
Mike Potts (run scripts)
Matei Zaharia (Bagel, bug fixes, build)

We thank everyone who helped with this release, and hope to see more contributions from you in the future!
