
[Question] pyiceberg 0.6.0 #350

Open
@gui-elastic

Description


Hello,

Recently, pyiceberg 0.6.0 was released, which allows writing Iceberg tables without needing tools like Spark or Trino.

I was about to write a custom plugin to implement the write feature. However, I see that when using the external materialization with a custom plugin, the output data is first stored locally and then read back and ingested into the final destination. For Iceberg and Delta this does not seem to be a good solution. It would be better to skip the local disk entirely: simply load an Arrow table in memory and write it to the final destination (e.g., S3 in Iceberg format), along the lines of the sketch below.
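A rough sketch of what I mean with the pyiceberg 0.6.0 write API (catalog name, endpoints, and table identifier are just placeholders, not the actual plugin config):

```python
# Sketch: write an in-memory Arrow table straight to an Iceberg table,
# without staging the data on local disk first.
import pyarrow as pa
from pyiceberg.catalog import load_catalog

# Data produced by the model, already in memory as an Arrow table.
arrow_table = pa.table({"id": [1, 2, 3], "name": ["a", "b", "c"]})

# Catalog configuration would come from the plugin settings (placeholders here).
catalog = load_catalog(
    "default",
    **{
        "uri": "http://localhost:8181",          # REST catalog endpoint (placeholder)
        "s3.endpoint": "http://localhost:9000",  # object store endpoint (placeholder)
    },
)

# Create the table on first run (pyiceberg can derive the Iceberg schema from
# the Arrow schema), then append on subsequent runs -- no local Parquet staging.
table = catalog.create_table("lakehouse.my_model", schema=arrow_table.schema)
table.append(arrow_table)
```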

I saw this thread: #332 (comment), so I would like to ask whether there is any ETA for implementing this feature. It would be an amazing feature even for production workloads with a Data Lakehouse architecture.

This comment explains well what needs to be fixed to use the Iceberg writer in the best way possible: #284 (comment)
