Have to drop/create extension on every run #149

@bryanfaria

Hi, cool extension. I'm just playing around with it, but I've hit the same error described in #68, specifically 68#issuecomment-2884770715.

I was really confused until I chanced upon that comment. I then found I could get around it by running this before every interaction:

DROP EXTENSION IF EXISTS pg_parquet;
CREATE EXTENSION IF NOT EXISTS pg_parquet;

But this does make me worry about race conditions.

Here's how I'm building postgres with your pg_parquet extension:

FROM postgres:17

RUN echo "invalidate build 898908900"

# Install build dependencies
RUN apt-get update
RUN apt-get install -y build-essential
RUN apt-get install -y git
RUN apt-get install -y curl
RUN apt-get install -y pkg-config
RUN apt-get install -y libssl-dev
RUN apt-get install -y postgresql-server-dev-17
RUN rm -rf /var/lib/apt/lists/*

# Install Rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"

# Install cargo-pgrx with the correct version
RUN cargo install cargo-pgrx --version 0.15.0 --locked

# Configure pgrx for PostgreSQL 17
RUN cargo pgrx init --pg17 $(which pg_config)

# Clone pg_parquet repository
RUN git clone https://github.com/crunchydata/pg_parquet.git /tmp/pg_parquet

# Build pg_parquet extension
WORKDIR /tmp/pg_parquet

RUN cargo pgrx install --release

# Clean up build dependencies
# RUN apt-get remove -y build-essential
# RUN apt-get remove -y git
# RUN apt-get remove -y curl
# RUN apt-get remove -y pkg-config
# RUN apt-get remove -y libssl-dev
# RUN apt-get autoremove -y
# RUN rm -rf /tmp/pg_parquet
# RUN rm -rf /root/.cargo/registry
# RUN rm -rf /root/.cargo/git

# Create directory for parquet files in a writable location
RUN mkdir -p /var/lib/postgresql/parquet

# Set permissions
RUN chown -R postgres:postgres /var/lib/postgresql/parquet

RUN mkdir ~/.pgrx/data-15/
RUN mkdir ~/.pgrx/data-17/

RUN echo "shared_preload_libraries = 'pg_parquet'" >> ~/.pgrx/data-15/postgresql.conf
RUN echo "shared_preload_libraries = 'pg_parquet'" >> ~/.pgrx/data-17/postgresql.conf

# Copy initialization script
COPY init.sql /docker-entrypoint-initdb.d/ 

The attached zip file contains the Parquet file that I'm loading:

20250808_155231_00159_p2a26_7985abc6-7137-4a02-959c-c67d52adaa47.zip

In my IDE, if the database connection times out or I close it, and I then reconnect and run:

COPY gold_attendance.dim_academic_year FROM '/var/lib/postgresql/parquet/gold/70385/attendance/dims/academic_year/parquet/20250808_155231_00159_p2a26_7985abc6-7137-4a02-959c-c67d52adaa47' (format 'parquet', match_by 'name');

I receive this error every time:

[2025-08-11 18:31:50] [22023] ERROR: COPY format "parquet" not recognized
[2025-08-11 18:31:50] Position: 190

But it succeeds every time if I run:

DROP EXTENSION IF EXISTS pg_parquet;
CREATE EXTENSION IF NOT EXISTS pg_parquet;
COPY gold_attendance.dim_academic_year FROM '/var/lib/postgresql/parquet/gold/70385/attendance/dims/academic_year/parquet/20250808_155231_00159_p2a26_7985abc6-7137-4a02-959c-c67d52adaa47' (format 'parquet', match_by 'name');

And if I run SELECT * FROM pg_catalog.pg_extension; I get the following result:

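 oid   | extname    | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition
-------+------------+----------+--------------+----------------+------------+-----------+--------------
 13569 | plpgsql    |       10 |           11 | false          | 1.0        | null      | null
 16698 | pg_parquet |       10 |         2200 | false          | 0.4.1      | null      | null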
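
A quick check that might help narrow this down, run in a brand-new session before any DROP/CREATE. This is only a sketch. It assumes that pg_parquet registers its COPY handling when its shared library is loaded into the backend, and that the extension therefore has to appear in shared_preload_libraries of the server that is actually running. I have not confirmed either assumption against the pg_parquet source.

```sql
-- Libraries the running server preloads at startup.
-- If pg_parquet is missing here, the setting written in the Dockerfile
-- may have gone to a postgresql.conf that this server does not read.
SHOW shared_preload_libraries;

-- Path of the configuration file the running server actually loaded.
SHOW config_file;

-- Assumption: loading the library by hand is enough to register the
-- parquet COPY handling for this session (requires superuser).
LOAD 'pg_parquet';
```

If the COPY ... (format 'parquet', ...) statement works right after the LOAD, that would suggest the library simply is not preloaded in new sessions, rather than anything being wrong with the installed extension itself.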
