Description
Hello,
Recently, pyiceberg 0.6.0 was released, which allows writing Iceberg tables without needing tools like Spark or Trino.
I was about to write a custom plugin to implement the write feature. However, when using the external materialization with a custom plugin, the output data is first written to local disk and then read back and ingested into the final destination. For Iceberg and Delta this does not seem like a good approach. Instead of storing the data on disk, it would be better to load an Arrow table in memory and write it directly to the final destination (e.g., S3 in Iceberg format).
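For illustration, here is a minimal sketch of what the in-memory write path could look like with pyiceberg 0.6.0, assuming the plugin already has the query result as a pyarrow Table. The catalog name, connection properties, and table identifier below are made-up placeholders, not the actual plugin API:

```python
import pyarrow as pa
from pyiceberg.catalog import load_catalog

# Arrow table as it would come straight out of DuckDB, no intermediate files on disk
arrow_table = pa.table({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# Load an Iceberg catalog; the name and properties depend on the actual setup (REST, Glue, SQL, ...)
catalog = load_catalog(
    "default",
    **{
        "uri": "http://localhost:8181",  # hypothetical REST catalog endpoint
    },
)

# Append the Arrow data directly to an existing Iceberg table (pyiceberg >= 0.6.0)
table = catalog.load_table("demo.events")  # hypothetical namespace.table
table.append(arrow_table)
```

The key point is that pyiceberg can take the Arrow data directly, so the plugin would not need the write-to-disk-then-re-read step at all.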
I saw this thread: #332 (comment), so I would like to ask whether there is any ETA for implementing this feature. It would be an amazing feature to have, even for production workloads with a Data Lakehouse architecture.
This comment explains well what needs to change to use the Iceberg writer in the best way possible: #284 (comment)