Commit a7e65c9
[3.2][Kernel][Defaults] Handle legacy map types in Parquet files (#3097)
Currently, Kernel's Parquet reader explicitly looks for the `key_value`
repeated group under the Parquet map type, but older Parquet writers
could give the repeated group any name. Instead of looking up the
`key_value` element by name, fetch the first element in the map group's
field list. See
[here](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#maps)
for more details.
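
The difference between the two lookups can be illustrated with a minimal, self-contained sketch. It is not the Kernel code from this commit: `Node`, `repeatedGroupByName`, and `repeatedGroupByPosition` are hypothetical stand-ins for the schema type and reader logic, and the legacy group name `map` is just one example of a non-standard name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class LegacyMapLookup {
    /** Hypothetical stand-in for a Parquet schema node (not a parquet-java or Kernel class). */
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = Collections.unmodifiableList(Arrays.asList(children));
        }
    }

    /** Name-based lookup: only finds a repeated group that is literally called "key_value". */
    static Node repeatedGroupByName(Node mapGroup) {
        for (Node child : mapGroup.children) {
            if (child.name.equals("key_value")) {
                return child;
            }
        }
        return null; // a legacy file with a differently named group ends up here
    }

    /** Position-based lookup: take the first child of the map group, whatever its name. */
    static Node repeatedGroupByPosition(Node mapGroup) {
        if (mapGroup.children.isEmpty()) {
            throw new IllegalArgumentException("map group has no repeated child: " + mapGroup.name);
        }
        return mapGroup.children.get(0);
    }

    public static void main(String[] args) {
        // Standard layout: map group -> repeated group "key_value" -> key, value
        Node standard = new Node("m",
            new Node("key_value", new Node("key"), new Node("value")));
        // Legacy layout: same shape, but the repeated group has a different name
        Node legacy = new Node("m",
            new Node("map", new Node("key"), new Node("value")));

        System.out.println(repeatedGroupByName(standard) != null);  // found by name
        System.out.println(repeatedGroupByName(legacy) == null);    // missed by name
        System.out.println(repeatedGroupByPosition(legacy).name);   // found by position
    }
}
```

The position-based lookup relies on the map group containing the repeated group as its first field, so both the standard and the legacy layout resolve to the same key/value group.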
The
[test](https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetThriftCompatibilitySuite.scala#L29)
and the sample file written by legacy writers are taken from Apache Spark™.
Some columns in the test file (arrays with 2-level encoding, another
legacy format) are currently not supported. I will follow up with a
separate PR, which involves a bit of refactoring of the ArrayColumnReader.

1 parent f598205 commit a7e65c9
File tree
5 files changed, +89 -23 lines changed
- kernel/kernel-defaults/src
  - main/java/io/delta/kernel/defaults/internal/parquet
  - test
    - resources/parquet
    - scala/io/delta/kernel/defaults/internal/parquet
Lines changed: 2 additions & 2 deletions
Lines changed: 12 additions & 4 deletions
Binary file not shown.
Lines changed: 57 additions & 1 deletion
Lines changed: 18 additions & 16 deletions