Commit 4f9c8b9
committed
[Spark] Upgrade Spark dependency to 3.5.0 in Delta-Spark
The following are the changes needed:
* PySpark 3.5 has deprecated the support for Python 3.7. This required changes to Delta test infra to install the appropriate Python version and other packages. The `Dockerfile` used for running tests is also updated to have required Python version and packages and uses the same base image as PySpark test infra in Apache Spark.
* `StructType.toAttributes` and `StructType.fromAttributes` methods are moved into a utility class `DataTypeUtils`.
* The `iceberg` module is disabled as there is no released version of `iceberg` that works with Spark 3.5 yet
* Remove the URI path hack used in `DMLWithDeletionVectorsHelper` to get around a bug in Spark 3.4.
* Remove unrelated tutorial in `delta/examples/tutorials/saiseu19`
* Test failure fixes
* `org.apache.spark.sql.delta.DeltaHistoryManagerSuite` - Error message has changed
* `org.apache.spark.sql.delta.DeltaOptionSuite` - Parquet file name using the LZ4 code has changed due to a apache/parquet-java#1000 in `parquet-mr` dependency.
* `org.apache.spark.sql.delta.deletionvectors.DeletionVectorsSuite` - Parquet now generates `row-index` whenever `_metadata` column is selected, however Spark 3.5 has a bug where a row group containing more than 2bn rows fails. For now don't return any `row-index` column in `_metadata` by overriding the `metadataSchemaFields: Seq[StructField]` in `DeltaParquetFileFormat`.
* `org.apache.spark.sql.delta.perf.OptimizeMetadataOnlyDeltaQuerySuite`: A behavior change by apache/spark#40922. In Spark plans a new function called `ToPrettyString` is used instead of `cast(aggExpr To STRING)` in when `Dataset.show()` usage.
* `org.apache.spark.sql.delta.DeltaCDCStreamDeletionVectorSuite` and `org.apache.spark.sql.delta.DeltaCDCStreamSuite`: Regression in Spark 3.5 RC fixed by apache/spark#42774 before the Spark 3.5 release
Closes #1986
GitOrigin-RevId: b0e4a81b608a857e45ecba71b070309347616a301 parent bcd1867 commit 4f9c8b9
File tree
58 files changed
+303
-1846
lines changed- .github/workflows
- dev
- docs
- examples/tutorials/saiseu19
- img
- python/delta/tests
- spark/src
- main/scala/org/apache/spark/sql/delta
- catalog
- commands
- cdc
- expressions
- files
- perf
- schema
- util
- test/scala/org/apache/spark/sql/delta
- deletionvectors
- test
Some content is hidden
Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
58 files changed
+303
-1846
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
46 | 46 | | |
47 | 47 | | |
48 | 48 | | |
49 | | - | |
50 | | - | |
51 | | - | |
52 | | - | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
53 | 53 | | |
54 | 54 | | |
55 | | - | |
| 55 | + | |
56 | 56 | | |
57 | 57 | | |
58 | 58 | | |
59 | 59 | | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
60 | 64 | | |
61 | 65 | | |
62 | 66 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
| 16 | + | |
16 | 17 | | |
17 | | - | |
18 | | - | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
19 | 28 | | |
20 | | - | |
21 | | - | |
22 | 29 | | |
23 | 30 | | |
24 | 31 | | |
25 | 32 | | |
26 | | - | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
27 | 42 | | |
28 | | - | |
| 43 | + | |
29 | 44 | | |
30 | 45 | | |
31 | 46 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
38 | | - | |
| 38 | + | |
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
| |||
115 | 115 | | |
116 | 116 | | |
117 | 117 | | |
| 118 | + | |
| 119 | + | |
118 | 120 | | |
119 | 121 | | |
120 | 122 | | |
| |||
322 | 324 | | |
323 | 325 | | |
324 | 326 | | |
| 327 | + | |
| 328 | + | |
325 | 329 | | |
326 | 330 | | |
327 | 331 | | |
| |||
451 | 455 | | |
452 | 456 | | |
453 | 457 | | |
454 | | - | |
| 458 | + | |
455 | 459 | | |
456 | 460 | | |
457 | 461 | | |
| |||
1081 | 1085 | | |
1082 | 1086 | | |
1083 | 1087 | | |
1084 | | - | |
| 1088 | + | |
1085 | 1089 | | |
1086 | 1090 | | |
1087 | 1091 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
19 | 19 | | |
20 | 20 | | |
21 | 21 | | |
22 | | - | |
| 22 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
142 | 142 | | |
143 | 143 | | |
144 | 144 | | |
145 | | - | |
| 145 | + | |
146 | 146 | | |
147 | 147 | | |
148 | 148 | | |
| |||
0 commit comments