From 89ee9b0c9b27324a3662e5b50b56902eef7d7749 Mon Sep 17 00:00:00 2001
From: Devin D'Angelo
Date: Tue, 20 Feb 2024 20:09:36 -0500
Subject: [PATCH] Update Docs with Copy partition_by support (#9275)

* update copy docs

* prettier
---
 docs/source/user-guide/sql/dml.md           | 12 ++++++++++++
 docs/source/user-guide/sql/write_options.md |  8 +++++---
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/docs/source/user-guide/sql/dml.md b/docs/source/user-guide/sql/dml.md
index 79b1d6625e8f..405e77a21b26 100644
--- a/docs/source/user-guide/sql/dml.md
+++ b/docs/source/user-guide/sql/dml.md
@@ -57,6 +57,18 @@ files in the `dir_name` directory:
 +-------+
 ```
 
+Copy the contents of `source_table` to multiple directories
+of hive-style partitioned parquet files:
+
+```sql
+> COPY source_table TO 'dir_name' (FORMAT parquet, partition_by 'column1, column2');
++-------+
+| count |
++-------+
+| 2     |
++-------+
+```
+
 Run the query `SELECT * from source ORDER BY time` and write the
 results (maintaining the order) to a parquet file named
 `output.parquet` with a maximum parquet row group size of 10MB:
diff --git a/docs/source/user-guide/sql/write_options.md b/docs/source/user-guide/sql/write_options.md
index 09d51903f4b1..ac0a41a97f07 100644
--- a/docs/source/user-guide/sql/write_options.md
+++ b/docs/source/user-guide/sql/write_options.md
@@ -56,6 +56,7 @@ TO 'test/table_with_options'
 (format parquet,
 compression snappy,
 'compression::col1' 'zstd(5)',
+partition_by 'column3, column4'
 )
 ```
 
@@ -67,9 +68,10 @@ In this example, we write the entirety of `source_table` out to a folder of parq
 
 The following special options are specific to the `COPY` command.
 
-| Option | Description | Default Value |
-| ------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- |
-| FORMAT | Specifies the file format COPY query will write out. If there're more than one output file or the format cannot be inferred from the file extension, then FORMAT must be specified. | N/A |
+| Option       | Description | Default Value |
+| ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- |
+| FORMAT       | Specifies the file format COPY query will write out. If there're more than one output file or the format cannot be inferred from the file extension, then FORMAT must be specified. | N/A |
+| PARTITION_BY | Specifies the columns that the output files should be partitioned by into separate hive-style directories. Value should be a comma separated string literal, e.g. 'col1,col2' | N/A |
 
 ### JSON Format Specific Options
 