@@ -22,6 +22,7 @@ COPY table FROM 's3://mybucket/data.parquet' WITH (format 'parquet');
2222 - [Inspect Parquet schema](#inspect-parquet-schema)
2323 - [Inspect Parquet metadata](#inspect-parquet-metadata)
2424 - [Inspect Parquet column statistics](#inspect-parquet-column-statistics)
25+ - [List and read Parquet files from uri pattern](#list-and-read-parquet-files-from-uri-pattern)
2526- [Object Store Support](#object-store-support)
2627- [Copy Options](#copy-options)
2728- [Configuration](#configuration)
@@ -200,6 +201,40 @@ SELECT * FROM parquet.column_stats('/tmp/product_example.parquet')
200201(13 rows)
201202```
202203
204+ ### List and read Parquet files from uri pattern
205+
206+ You can call `SELECT * FROM parquet.list(<uri_pattern>)` to list all uris that match the uri pattern, together with their sizes.
207+ A uri pattern can contain `**` to match directories and `*` to match words in the uri.
208+
209+
210+ ```sql
211+ COPY (SELECT i FROM generate_series(1, 1000000) i) TO '/tmp/some/test.parquet' with (file_size_bytes '1MB');
212+ COPY 1000000
213+
214+ SELECT * FROM parquet.list('/tmp/some/**/*.parquet');
215+                   uri                  |  size
216+ ---------------------------------------+---------
217+  /tmp/some/test.parquet/data_4.parquet |  100162
218+  /tmp/some/test.parquet/data_3.parquet | 1486916
219+  /tmp/some/test.parquet/data_2.parquet | 1486916
220+  /tmp/some/test.parquet/data_0.parquet | 1486920
221+  /tmp/some/test.parquet/data_1.parquet | 1486916
222+ (5 rows)
223+
224+ ```
225+
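Because `parquet.list` is queried like a table, its output can be filtered and aggregated like any other result set. The following is a minimal sketch, assuming only the `uri` and `size` columns shown in the listing above; treating `size` as a byte count is an assumption, not something the listing states.

```sql
-- Count the files matched by the pattern and total their sizes.
-- Uses only the `uri` and `size` columns shown in the listing above;
-- interpreting `size` as bytes is an assumption.
SELECT count(*)  AS file_count,
       sum(size) AS total_size
FROM parquet.list('/tmp/some/**/*.parquet');
```

The same pattern string can then be reused unchanged once the set of matched files looks right.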
226+ `COPY FROM` also supports uri patterns for all supported object stores except `http(s)` endpoints.
227+ ```sql
228+ COPY (SELECT i FROM generate_series(1, 1000000) i) TO 's3://testbucket/some/test.parquet' with (file_size_bytes '1MB');
229+ COPY 1000000
230+
231+ CREATE TABLE test (a int);
232+ CREATE TABLE
233+
234+ COPY test FROM 's3://testbucket/some/**/*.parquet';
235+ COPY 1000000
236+ ```
237+
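Before loading from a pattern, it can help to preview which files the pattern resolves to. This is a sketch only: it assumes `parquet.list` accepts the same object store uris as `COPY FROM` (the example above only demonstrates it on a local path), and it reuses the `testbucket` bucket and `test` table from the example above.

```sql
-- Preview the files that the pattern resolves to.
-- Assumption: parquet.list works on s3:// uris, as COPY FROM does.
SELECT uri, size
FROM parquet.list('s3://testbucket/some/**/*.parquet');

-- After running the COPY FROM above, check how many rows were loaded
-- from all matched files.
SELECT count(*) FROM test;
```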
203238 ## Object Store Support
204239 `pg_parquet` supports reading and writing Parquet files from/to `S3`, `Azure Blob Storage`, `http(s)` and `Google Cloud Storage` object stores.
205240
@@ -287,7 +322,7 @@ Supported authorization methods' priority order is shown below:
287322
288323#### Http(s) Storage
289324
290- `Https` uris are supported by default. You can set `ALLOW_HTTP` environment variable to allow `http` uris.
325+ Only `https` uris are supported by default. You can set the `ALLOW_HTTP` environment variable to allow `http` uris.
291326
292327#### Google Cloud Storage
293328