Spark Release 3.2.1

Spark 3.2.1 is a maintenance release containing stability fixes. This release is based on the branch-3.2 maintenance branch of Spark. We strongly recommend that all 3.2 users upgrade to this stable release.

Notable changes

  • [SPARK-33277]: Python/Pandas UDF right after off-heap vectorized reader could cause executor crash.
  • [SPARK-34399]: Add file commit time to metrics and show it in the SQL Tab UI
  • [SPARK-35714]: Bug fix for deadlock during the executor shutdown
  • [SPARK-36754]: array_intersect should handle Double.NaN and Float.NaN
  • [SPARK-37001]: Disable two level of map for final hash aggregation by default
  • [SPARK-37023]: Avoid fetching merge status when shuffleMergeEnabled is false for a shuffleDependency during retry
  • [SPARK-37088]: Python UDF after off-heap vectorized reader can cause crash due to use-after-free in writer thread
  • [SPARK-37202]: Temp view didn’t collect temp functions that were registered with the catalog API
  • [SPARK-37208]: Support mapping Spark gpu/fpga resource types to custom YARN resource type
  • [SPARK-37214]: Fail query analysis earlier with invalid identifiers
  • [SPARK-37392]: Fix the performance bug when inferring constraints for Generate
  • [SPARK-37695]: Skip diagnosis on merged blocks from push-based shuffle
  • [SPARK-37705]: Write session time zone in the Parquet file metadata so that rebase can use it instead of JVM timezone
  • [SPARK-37957]: Deterministic flag is not handled for V2 functions
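
To illustrate why an item such as SPARK-36754 matters: under IEEE 754 comparison, NaN is not equal to itself, so an intersection built on plain equality silently drops NaN even when both arrays contain it. The sketch below is plain Python, not Spark code; the function names `naive_intersect` and `nan_aware_intersect` are hypothetical, and the de-duplicating, order-preserving behavior is an assumption about the intended semantics rather than a copy of Spark's implementation.

```python
import math

# Unique sentinel standing in for "any NaN" so that all NaNs compare equal.
_NAN = object()


def _key(value):
    """Return a hashable comparison key; every float NaN maps to one sentinel."""
    if isinstance(value, float) and math.isnan(value):
        return _NAN
    return value


def naive_intersect(left, right):
    """Intersection using plain ==, which never matches NaN against NaN."""
    result = []
    for x in left:
        if any(x == y for y in right) and not any(x == r for r in result):
            result.append(x)
    return result


def nan_aware_intersect(left, right):
    """Order-preserving, de-duplicated intersection that treats NaN as equal to NaN."""
    right_keys = {_key(y) for y in right}
    seen = set()
    result = []
    for x in left:
        k = _key(x)
        if k in right_keys and k not in seen:
            seen.add(k)
            result.append(x)
    return result


if __name__ == "__main__":
    nan = float("nan")
    print(naive_intersect([nan, 1.0, 2.0], [nan, 2.0]))      # NaN is lost
    print(nan_aware_intersect([nan, 1.0, 2.0], [nan, 2.0]))  # NaN is kept
```

Normalizing NaN to a single key before comparison is one simple way to get the NaN-equals-NaN behavior; the actual change in Spark may be structured differently, so consult the JIRA ticket for details.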

Dependency changes

Although this is a maintenance release, some dependencies were still upgraded in this release.

You can consult JIRA for the detailed changes.

We would like to acknowledge all community members for contributing patches to this release.
