# azureBlobStorage Table Function
Provides a table-like interface to select/insert files in Azure Blob Storage. This table function is similar to the s3 function.
## Syntax
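The call shape below is a sketch assembled from the argument list that follows; bracketed arguments are optional.

```sql
azureBlobStorage(connection_string|storage_account_url, container_name, blobpath
    [, account_name, account_key, format, compression, structure,
       partition_strategy, partition_columns_in_data_file])
```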
## Arguments
| Argument | Description |
|---|---|
| `connection_string` \| `storage_account_url` | `connection_string` includes the account name & key (Create connection string), or you can provide the storage account URL here and the account name & account key as separate parameters (see the `account_name` & `account_key` parameters). |
| `container_name` | Container name. |
| `blobpath` | File path. Supports the following wildcards in read-only mode: `*`, `**`, `?`, `{abc,def}` and `{N..M}` where `N`, `M` are numbers and `'abc'`, `'def'` are strings. |
| `account_name` | If `storage_account_url` is used, the account name can be specified here. |
| `account_key` | If `storage_account_url` is used, the account key can be specified here. |
| `format` | The format of the file. |
| `compression` | Supported values: `none`, `gzip/gz`, `brotli/br`, `xz/LZMA`, `zstd/zst`. By default, compression is autodetected from the file extension (same as setting it to `auto`). |
| `structure` | Structure of the table. Format: `'column1_name column1_type, column2_name column2_type, ...'`. |
| `partition_strategy` | Optional. Supported values: `WILDCARD` or `HIVE`. `WILDCARD` requires a `{_partition_id}` in the path, which is replaced with the partition key. `HIVE` does not allow wildcards, assumes the path is the table root, and generates Hive-style partitioned directories with Snowflake IDs as filenames and the file format as the extension. Defaults to `WILDCARD`. |
| `partition_columns_in_data_file` | Optional. Only used with the `HIVE` partition strategy. Tells ClickHouse whether to expect partition columns to be written in the data file. Defaults to `false`. |
## Returned value
A table with the specified structure for reading or writing data in the specified file.
## Examples
Similar to the AzureBlobStorage table engine, users can use the Azurite emulator for local Azure Storage development. Further details here. Below we assume Azurite is available at the hostname `azurite1`.
Write data into Azure Blob Storage using the following:
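The sketch below assumes the default Azurite development account `devstoreaccount1` with its well-known development key, and a hypothetical container named `testcontainer`.

```sql
INSERT INTO TABLE FUNCTION azureBlobStorage(
    'http://azurite1:10000/devstoreaccount1',
    'testcontainer',
    'test.csv',
    'devstoreaccount1',
    'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==',
    'CSV',
    'auto',
    'column1 UInt32, column2 UInt32, column3 UInt32')
VALUES (1, 2, 3), (3, 2, 1), (78, 43, 3);
```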
The data can then be read using:
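For example, with the same assumed Azurite account and container as above:

```sql
SELECT column1, column2, column3
FROM azureBlobStorage(
    'http://azurite1:10000/devstoreaccount1',
    'testcontainer',
    'test.csv',
    'devstoreaccount1',
    'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==',
    'CSV',
    'auto',
    'column1 UInt32, column2 UInt32, column3 UInt32');
```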
or using a `connection_string`:
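A sketch of the same read, assuming the Azurite development connection string; with a connection string the account name and key arguments are not passed separately:

```sql
SELECT count(*)
FROM azureBlobStorage(
    'DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://azurite1:10000/devstoreaccount1;',
    'testcontainer',
    'test.csv',
    'CSV',
    'auto',
    'column1 UInt32, column2 UInt32, column3 UInt32');
```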
## Virtual Columns
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the file size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
## Partitioned Write
### Partition Strategy
Supported for INSERT queries only.
- `WILDCARD` (default): Replaces the `{_partition_id}` wildcard in the file path with the actual partition key (see the sketch below).
- `HIVE`: Implements Hive-style partitioning for reads & writes. It generates files using the following format: `<prefix>/<key1=val1/key2=val2...>/<snowflakeid>.<toLower(file_format)>`.
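A sketch of a `WILDCARD` write, reusing the assumed Azurite account and container from the examples above; one file is produced per value of `column3`:

```sql
INSERT INTO TABLE FUNCTION azureBlobStorage(
    'http://azurite1:10000/devstoreaccount1',
    'testcontainer',
    'test_{_partition_id}.csv',
    'devstoreaccount1',
    'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==',
    'CSV',
    'auto',
    'column1 UInt32, column2 UInt32, column3 UInt32')
PARTITION BY column3
VALUES (1, 2, 3), (3, 2, 1), (78, 43, 3);
```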
Example of the `HIVE` partition strategy:
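A sketch, again assuming the Azurite account and container from above and a hypothetical `hive_partitioned` table root:

```sql
INSERT INTO TABLE FUNCTION azureBlobStorage(
    'http://azurite1:10000/devstoreaccount1',
    'testcontainer',
    'hive_partitioned',
    'devstoreaccount1',
    'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==',
    'CSVWithNames',
    'auto',
    'year UInt16, country String, id Int32',
    'HIVE')
PARTITION BY (year, country)
VALUES (2020, 'Russia', 1), (2021, 'Brazil', 2);
```

Following the format described above, the written files land under Hive-style directories such as `hive_partitioned/year=2020/country=Russia/`, with a Snowflake ID as the filename and the lowercased format as the extension.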
### use_hive_partitioning setting
This is a hint for ClickHouse to parse Hive-style partitioned files at read time. It has no effect on writing. For symmetrical reads and writes, use the `partition_strategy` argument.
When the setting `use_hive_partitioning` is set to 1, ClickHouse detects Hive-style partitioning in the path (`/name=value/`) and allows partition columns to be used as virtual columns in the query. These virtual columns have the same names as in the partitioned path, but start with `_`.
#### Example

Use a virtual column created with Hive-style partitioning:
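A sketch that reads back the hypothetical `hive_partitioned` data written above, filtering on the `_year` virtual column:

```sql
SET use_hive_partitioning = 1;

SELECT *
FROM azureBlobStorage(
    'http://azurite1:10000/devstoreaccount1',
    'testcontainer',
    'hive_partitioned/**',
    'devstoreaccount1',
    'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==',
    'CSVWithNames')
WHERE _year = 2020;
```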
## Using Shared Access Signatures (SAS)
A Shared Access Signature (SAS) is a URI that grants restricted access to an Azure Storage container or file. Use it to provide time-limited access to storage account resources without sharing your storage account key. More details here.
The `azureBlobStorage` function supports Shared Access Signatures (SAS).

A Blob SAS token contains all the information needed to authenticate the request, including the target blob, permissions, and validity period. To construct a blob URL, append the SAS token to the blob service endpoint. For example, if the endpoint is `https://clickhousedocstest.blob.core.windows.net/`, the request becomes:
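The sketch below leaves the SAS token as a placeholder and uses hypothetical container/file names (`exampledatasets`, `example.csv`):

```sql
SELECT count()
FROM azureBlobStorage(
    'https://clickhousedocstest.blob.core.windows.net/?<sas-token>',
    'exampledatasets',
    'example.csv');
```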
Alternatively, users can use the generated Blob SAS URL:
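For example, with the portal-generated Blob SAS URL left as a placeholder and the same hypothetical container/file names:

```sql
SELECT count()
FROM azureBlobStorage(
    '<generated-blob-sas-url>',
    'exampledatasets',
    'example.csv');
```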