Spark Release 3.0.2
Spark 3.0.2 is a maintenance release containing stability fixes. This release is based on the branch-3.0 maintenance branch of Spark. We strongly recommend that all 3.0 users upgrade to this stable release.
Notable changes
- [SPARK-31511]: Make BytesToBytesMap iterator() thread-safe
- [SPARK-32635]: When the pyspark.sql.functions.lit() function is used with a cached DataFrame, it returns a wrong result
- [SPARK-32753]: Deduplicating and repartitioning the same column create duplicate rows with AQE
- [SPARK-32764]: Comparison of -0.0 < 0.0 returns true (see the example after this list)
- [SPARK-32840]: An invalid interval value can be parsed when it is written directly adjacent to its unit
- [SPARK-32908]: percentile_approx() returns incorrect results
- [SPARK-33019]: Use spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1 by default (see the sketch after this list)
- [SPARK-33183]: Bug in optimizer rule EliminateSorts
- [SPARK-33260]: SortExec produces incorrect results if sortOrder is a Stream
- [SPARK-33290]: REFRESH TABLE should invalidate cache even though the table itself may not be cached
- [SPARK-33358]: Spark SQL CLI command processing loop can’t exit when one command fails
- [SPARK-33404]: “date_trunc” expression returns incorrect results
- [SPARK-33435]: DSv2: REFRESH TABLE should invalidate caches
- [SPARK-33591]: NULL is recognized as the “null” string in partition specs
- [SPARK-33593]: Vector reader got incorrect data with binary partition value
- [SPARK-33726]: Duplicate field names cause wrong answers during aggregation
- [SPARK-33819]: SingleFileEventLogFileReader/RollingEventLogFilesFileReader should be package private
- [SPARK-33950]: ALTER TABLE .. DROP PARTITION doesn’t refresh cache
- [SPARK-34011]: ALTER TABLE .. RENAME TO PARTITION doesn’t refresh cache
- [SPARK-34027]: ALTER TABLE .. RECOVER PARTITIONS doesn’t refresh cache
- [SPARK-34055]: ALTER TABLE .. ADD PARTITION doesn’t refresh cache (see the sketch after this list)
- [SPARK-34187]: Use available offset range obtained during polling when checking offset validation
- [SPARK-34212]: For Parquet tables, after changing the precision and scale of a decimal type in Hive, Spark reads incorrect values
- [SPARK-34213]: LOAD DATA doesn’t refresh v1 table cache
- [SPARK-34229]: Avro should read decimal values with the file schema
- [SPARK-34262]: ALTER TABLE .. SET LOCATION doesn’t refresh v1 table cache
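To illustrate the comparison fix (SPARK-32764), a minimal sketch run in spark-shell, where `spark` is the ambient SparkSession:

```scala
// IEEE 754 and SQL semantics treat -0.0 and 0.0 as equal, so the
// comparison below should yield false; before this fix it returned true.
spark.sql("SELECT -0.0 < 0.0 AS lt, -0.0 = 0.0 AS eq").show()
// Expected after the fix: lt = false, eq = true
```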
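The committer default restored by SPARK-33019 can also be pinned explicitly in your own application. A minimal sketch; the application name is illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Pin the Hadoop file output committer to algorithm version 1.
// Spark 3.0.2 makes this the default (SPARK-33019), because version 2
// can produce incorrect output when tasks fail partway through.
val spark = SparkSession.builder()
  .appName("committer-v1-example") // illustrative name
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "1")
  .getOrCreate()
```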
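Several fixes above (SPARK-33290, SPARK-33950, SPARK-34011, SPARK-34027, SPARK-34055, SPARK-34213, SPARK-34262) concern DDL commands that previously left a table’s cache stale. A minimal sketch of the ADD PARTITION case, using a hypothetical table `t` and an existing SparkSession `spark`:

```scala
// Create, cache, then alter a partitioned table (hypothetical table `t`).
spark.sql("CREATE TABLE t (id INT, part INT) USING parquet PARTITIONED BY (part)")
spark.sql("CACHE TABLE t")
spark.sql("ALTER TABLE t ADD PARTITION (part = 1)")

// With these fixes the DDL invalidates the cache, so this scan reflects
// the new partition; previously stale cached data could be returned and
// `REFRESH TABLE t` was the manual workaround.
spark.table("t").show()
```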
Dependency Changes
While this is a maintenance release, we still upgraded some dependencies in this release; they are:
Known issues
You can consult JIRA for the detailed changes.
We would like to acknowledge all community members for contributing patches to this release.