Spark Release 0.6.2

Spark 0.6.2 is a maintenance release that contains several bug fixes and usability improvements. You can download it as a source package (2.5 MB tar.gz) or prebuilt package (48 MB tar.gz).

We recommend that all Spark 0.6 users update to this maintenance release.

The fixes and improvements in this version include:

  • Several fault tolerance fixes covering detection of dead nodes, handling of missing map output fetches, and letting failed nodes rejoin the cluster
  • Documentation fixes that clarify the configuration for the standalone mode and improve the quick start instructions
  • A connection reuse bug fix that improves shuffle performance
  • Support for launching a cluster across multiple availability zones in the EC2 scripts
  • Support for deleting security groups when an EC2 cluster is terminated
  • Improved memory configuration for the standalone deploy cluster daemons: they no longer take their memory from SPARK_MEM, which often led people to give them far more memory than intended. They now use a separate variable, SPARK_DAEMON_MEMORY, which defaults to 512 MB
  • Fixes to the Windows run scripts for Spark
  • Better detection of a machine's external IP address
  • Several small optimizations and bug fixes
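
To illustrate the memory configuration change listed above, here is a minimal sketch of how the two variables might be set separately. It assumes the settings live in conf/spark-env.sh, and the 2g value for SPARK_MEM is a hypothetical example rather than a recommendation; only the variable names and the 512 MB default come from these release notes.

```shell
# conf/spark-env.sh (assumed location; values below are illustrative)

# Memory for Spark jobs themselves. 2g is a hypothetical example value.
export SPARK_MEM=2g

# Memory for the standalone master and worker daemons only.
# 512m mirrors the default described above, so this line is optional.
export SPARK_DAEMON_MEMORY=512m
```

With the two settings split this way, raising SPARK_MEM for a large job should no longer inflate the memory given to the daemons.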

In total, eleven people contributed to this release:

  • Stephen Haberman (bug fix)
  • Shane Huang (shuffle fix)
  • Fernand Pajot (bug fix)
  • Andrew Psaltis (bug fix)
  • Imran Rashid (standalone cluster, bug fix)
  • Charles Reiss (fault recovery fixes, node re-registration, tests)
  • Josh Rosen (fault recovery, Java API fixes, deploy scripts)
  • Peter Sankauskas (EC2 scripts)
  • Lee Moon Soo (bug fix)
  • Patrick Wendell (bugs, docs)
  • Matei Zaharia (fault recovery, UI, docs, bug fixes)
