Closed
Description
Hi, cool extension. I'm just playing around with it, but I've hit the same error described in #68, specifically 68#issuecomment-2884770715
I was really confused until I chanced upon that comment. After that, I found I could get around the error by running the following at the start of every session:
DROP EXTENSION IF EXISTS pg_parquet;
CREATE EXTENSION IF NOT EXISTS pg_parquet;
But this does make me worry about race conditions.
Here's how I'm building postgres with your pg_parquet extension:
FROM postgres:17
RUN echo "invalidate build 898908900"
# Install build dependencies
RUN apt-get update
RUN apt-get install -y build-essential
RUN apt-get install -y git
RUN apt-get install -y curl
RUN apt-get install -y pkg-config
RUN apt-get install -y libssl-dev
RUN apt-get install -y postgresql-server-dev-17
RUN rm -rf /var/lib/apt/lists/*
# Install Rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"
# Install cargo-pgrx with the correct version
RUN cargo install cargo-pgrx --version 0.15.0 --locked
# Configure pgrx for PostgreSQL 17
RUN cargo pgrx init --pg17 $(which pg_config)
# Clone pg_parquet repository
RUN git clone https://github.com/crunchydata/pg_parquet.git /tmp/pg_parquet
# Build pg_parquet extension
WORKDIR /tmp/pg_parquet
RUN cargo pgrx install --release
# Clean up build dependencies
# RUN apt-get remove -y build-essential
# RUN apt-get remove -y git
# RUN apt-get remove -y curl
# RUN apt-get remove -y pkg-config
# RUN apt-get remove -y libssl-dev
# RUN apt-get autoremove -y
# RUN rm -rf /tmp/pg_parquet
# RUN rm -rf /root/.cargo/registry
# RUN rm -rf /root/.cargo/git
# Create directory for parquet files in a writable location
RUN mkdir -p /var/lib/postgresql/parquet
# Set permissions
RUN chown -R postgres:postgres /var/lib/postgresql/parquet
RUN mkdir ~/.pgrx/data-15/
RUN mkdir ~/.pgrx/data-17/
RUN echo "shared_preload_libraries = 'pg_parquet'" >> ~/.pgrx/data-15/postgresql.conf
RUN echo "shared_preload_libraries = 'pg_parquet'" >> ~/.pgrx/data-17/postgresql.conf
# Copy initialization script
COPY init.sql /docker-entrypoint-initdb.d/
The attached zip file contains the Parquet file that I'm loading:
20250808_155231_00159_p2a26_7985abc6-7137-4a02-959c-c67d52adaa47.zip
In my IDE, if the database connection times out or I close it, and I then reconnect and run:
COPY gold_attendance.dim_academic_year FROM '/var/lib/postgresql/parquet/gold/70385/attendance/dims/academic_year/parquet/20250808_155231_00159_p2a26_7985abc6-7137-4a02-959c-c67d52adaa47' (format 'parquet', match_by 'name');
I get this error message every time:
[2025-08-11 18:31:50] [22023] ERROR: COPY format "parquet" not recognized
[2025-08-11 18:31:50] Position: 190
But it succeeds every time if I run:
DROP EXTENSION IF EXISTS pg_parquet;
CREATE EXTENSION IF NOT EXISTS pg_parquet;
COPY gold_attendance.dim_academic_year FROM '/var/lib/postgresql/parquet/gold/70385/attendance/dims/academic_year/parquet/20250808_155231_00159_p2a26_7985abc6-7137-4a02-959c-c67d52adaa47' (format 'parquet', match_by 'name');
If I run `SELECT * FROM pg_catalog.pg_extension;`, I get the following result:
| oid | extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition |
|---|---|---|---|---|---|---|---|
| 13569 | plpgsql | 10 | 11 | false | 1.0 | null | null |
| 16698 | pg_parquet | 10 | 2200 | false | 0.4.1 | null | null |
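Since the extension is still registered in `pg_extension`, one possible diagnostic is to check whether the library itself is loaded in a fresh session. The sketch below assumes that the parquet COPY handling is only active once the `pg_parquet` shared library has been loaded into the backend, and that the server started by the `postgres:17` image reads the `postgresql.conf` in its own data directory rather than the files under `~/.pgrx/` written in the Dockerfile. Both points are assumptions to verify, not confirmed behavior.

```sql
-- Run in a brand-new session, before any DROP/CREATE EXTENSION.

-- Is pg_parquet actually preloaded by the running server?
SHOW shared_preload_libraries;

-- Which configuration file is the running server reading?
SHOW config_file;

-- Assumption: loading the library into this session is enough to make
-- the parquet COPY format available, without dropping and recreating
-- the extension. LOAD normally requires superuser privileges.
LOAD 'pg_parquet';

-- Then re-run the failing COPY ... (format 'parquet', match_by 'name')
-- statement from above in the same session.
```

If `SHOW shared_preload_libraries;` comes back empty, that would suggest the setting never reached the running server, which would be consistent with the error appearing on every new connection.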