Conversation

@aykut-bozkurt
Member

@aykut-bozkurt aykut-bozkurt commented Jul 27, 2025

It is now possible to run programs with the COPY table TO/FROM PROGRAM '...' WITH (format parquet) syntax.

e.g.

pg_parquet=# CREATE TABLE test_table(a int);
CREATE TABLE

pg_parquet=# INSERT INTO test_table SELECT i FROM generate_series(1,10) i;
INSERT 0 10

pg_parquet=# COPY test_table TO PROGRAM 'cat > /tmp/test.parquet' WITH (format parquet);
COPY 10

pg_parquet=# COPY test_table FROM PROGRAM 'cat /tmp/test.parquet' WITH (format parquet);
COPY 10

Similar to how we support COPY TO/FROM stdin/stdout WITH (format parquet), we use a temp file as an intermediate file. For COPY TO PROGRAM, the temp file is piped to the program's stdin; for COPY FROM PROGRAM, the program's stdout is piped into the temp file.
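To make the temp-file approach concrete, here is a minimal standalone sketch of both directions using only the Rust standard library. It is not the code from this PR: the function names `pipe_file_to_program` and `pipe_program_to_file` are hypothetical, it assumes a Unix system with `/bin/sh`, and it returns errors instead of panicking.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;
use std::process::{Command, Stdio};

/// TO PROGRAM direction (hypothetical helper): stream an already written
/// temp file into the stdin of a shell command.
fn pipe_file_to_program(tmp_path: &Path, shell_cmd: &str) -> io::Result<()> {
    let mut file = File::open(tmp_path)?;
    let mut child = Command::new("/bin/sh")
        .arg("-c")
        .arg(shell_cmd)
        .stdin(Stdio::piped())
        .spawn()?;
    {
        // The pipe is dropped (closed) at the end of this scope,
        // which lets the child process see EOF on its stdin.
        let mut child_stdin = child.stdin.take().expect("stdin was set to piped");
        io::copy(&mut file, &mut child_stdin)?;
    }
    let status = child.wait()?;
    if !status.success() {
        let msg = format!("program exited with {status}");
        return Err(io::Error::new(io::ErrorKind::Other, msg));
    }
    Ok(())
}

/// FROM PROGRAM direction (hypothetical helper): capture the stdout of a
/// shell command into a temp file and return the number of bytes copied.
fn pipe_program_to_file(shell_cmd: &str, tmp_path: &Path) -> io::Result<u64> {
    let mut file = File::create(tmp_path)?;
    let mut child = Command::new("/bin/sh")
        .arg("-c")
        .arg(shell_cmd)
        .stdout(Stdio::piped())
        .spawn()?;
    let mut child_stdout = child.stdout.take().expect("stdout was set to piped");
    let copied = io::copy(&mut child_stdout, &mut file)?;
    let status = child.wait()?;
    if !status.success() {
        let msg = format!("program exited with {status}");
        return Err(io::Error::new(io::ErrorKind::Other, msg));
    }
    Ok(copied)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir();
    let pid = std::process::id();
    let src = dir.join(format!("copy_program_sketch_{pid}.src"));
    let dst = dir.join(format!("copy_program_sketch_{pid}.dst"));
    let back = dir.join(format!("copy_program_sketch_{pid}.back"));

    // Plain bytes stand in for the Parquet data written by COPY.
    fs::write(&src, b"stand-in for parquet bytes")?;
    pipe_file_to_program(&src, &format!("cat > '{}'", dst.display()))?;
    pipe_program_to_file(&format!("cat '{}'", dst.display()), &back)?;
    assert_eq!(fs::read(&src)?, fs::read(&back)?);

    for path in [&src, &dst, &back] {
        fs::remove_file(path)?;
    }
    Ok(())
}
```

The sketch mirrors the two SQL examples above: `cat > file` consumes the temp file on stdin, and `cat file` produces the bytes that end up in the temp file.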

Closes #147.

@codecov

codecov bot commented Jul 27, 2025

Codecov Report

❌ Patch coverage is 97.05882% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.03%. Comparing base (0d80fcd) to head (36bf1c8).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/pgrx_tests/copy_program.rs | 96.77% | 7 Missing ⚠️ |
| src/arrow_parquet/uri_utils.rs | 95.65% | 2 Missing ⚠️ |
| src/parquet_copy_hook/copy_to_program.rs | 88.88% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #146      +/-   ##
==========================================
+ Coverage   90.84%   91.03%   +0.18%     
==========================================
  Files          92       95       +3     
  Lines       10442    10759     +317     
==========================================
+ Hits         9486     9794     +308     
- Misses        956      965       +9     


  // permission check is not needed for stdin/out
- if uri_info.stdio_tmp_fd.is_some() {
+ if uri_info.stdio_tmp_fd.is_some() && !is_program {
Collaborator

When would you have stdin + program?

Member Author

That is fine: we use a temp file as an intermediate step for both stdio and program. Let me rename it to reduce confusion.

pipe().unwrap_or_else(|e| panic!("Failed to create command pipe: {e}"));

#[cfg(unix)]
let mut command = Command::new("/bin/sh")
Collaborator

Can we use the same functions that COPY uses?

@aykut-bozkurt aykut-bozkurt force-pushed the aykut/program branch 3 times, most recently from 78499b6 to 2778946 on August 10, 2025 18:46
Collaborator

@pgguru pgguru left a comment

Looks pretty good overall; one question, mainly to fill in some of my Rust gaps.


// Write temp file to pipe file
std::io::copy(&mut file, &mut pipe_file)
    .unwrap_or_else(|e| panic!("Failed to copy file to command stdin: {e}"));
Collaborator

I'm not great at Rust, so there may be some semantic misunderstanding here, but how does this differ from .expect(), just no access to the underlying value for the message?

Member Author

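std::io::copy returns a Result. With unwrap_or_else, the closure receives the error, so we can format it into the panic message ourselves with {e} (its Display output). expect only accepts a fixed message and appends the error's Debug output after it.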
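The difference between the two can be seen in a small standalone sketch. The helpers `failing_copy` and `panic_message` are hypothetical and exist only for this illustration; `failing_copy` stands in for a `std::io::copy` call that fails, and `panic_message` catches the panic so that the two messages can be compared.

```rust
use std::io;
use std::panic;

/// Hypothetical stand-in for a failing `std::io::copy` call.
fn failing_copy() -> io::Result<u64> {
    Err(io::Error::new(io::ErrorKind::BrokenPipe, "pipe closed"))
}

/// Hypothetical helper: run a closure that is expected to panic and
/// return the panic message as a String.
fn panic_message<F: FnOnce() + panic::UnwindSafe>(f: F) -> String {
    let payload = panic::catch_unwind(f).expect_err("closure should panic");
    match payload.downcast::<String>() {
        Ok(s) => *s,
        Err(p) => p
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .unwrap_or_default(),
    }
}

fn main() {
    // expect: fixed message, followed by the error's Debug representation.
    let with_expect = panic_message(|| {
        failing_copy().expect("Failed to copy file to command stdin");
    });

    // unwrap_or_else: the closure receives the error, so the message can
    // use its Display representation (or any other custom formatting).
    let with_closure = panic_message(|| {
        failing_copy()
            .unwrap_or_else(|e| panic!("Failed to copy file to command stdin: {e}"));
    });

    println!("expect:         {with_expect}");
    println!("unwrap_or_else: {with_closure}");
}
```

Both variants panic on error; the closure form simply gives full control over how the error is rendered. The default panic hook still prints each panic to stderr while the sketch runs.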

@aykut-bozkurt aykut-bozkurt enabled auto-merge (squash) September 19, 2025 16:49
@aykut-bozkurt aykut-bozkurt merged commit 7060d99 into main Sep 19, 2025
14 checks passed
@aykut-bozkurt aykut-bozkurt deleted the aykut/program branch September 19, 2025 16:57

Development

Successfully merging this pull request may close these issues.

Support COPY TO/FROM PROGRAM

4 participants