airflow.providers.microsoft.azure.hooks.wasb
This module contains integration with Azure Blob Storage.
It communicates via the Windows Azure Storage Blob protocol. Make sure that an Airflow connection of type wasb exists. Authorization can be done by supplying a login (= storage account name) and password (= key), or a login and a SAS token in the extra field (see connection wasb_default for an example).
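The two authorization options above can be sketched as keyword arguments for a connection of type wasb. This is only an illustration: the helper name `wasb_connection_kwargs` and the placeholder credentials are hypothetical, and the block builds plain dictionaries so that it runs without Airflow installed.

```python
import json
from typing import Optional


def wasb_connection_kwargs(
    account_name: str,
    account_key: Optional[str] = None,
    sas_token: Optional[str] = None,
    conn_id: str = "wasb_default",
) -> dict:
    """Build keyword arguments describing a wasb connection.

    Hypothetical helper: pass either an account key (stored as the
    password) or a SAS token (stored in the JSON 'extra' field).
    """
    if (account_key is None) == (sas_token is None):
        raise ValueError("Provide exactly one of account_key or sas_token")

    kwargs = {"conn_id": conn_id, "conn_type": "wasb", "login": account_name}
    if account_key is not None:
        kwargs["password"] = account_key
    else:
        kwargs["extra"] = json.dumps({"sas_token": sas_token})
    return kwargs


# Placeholder values; replace with real credentials.
key_auth = wasb_connection_kwargs("mystorageaccount", account_key="YOUR_KEY")
sas_auth = wasb_connection_kwargs("mystorageaccount", sas_token="YOUR_TOKEN")
```

With Airflow installed, `airflow.models.Connection(**key_auth)` should yield the corresponding connection object.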
Module Contents
-
class airflow.providers.microsoft.azure.hooks.wasb.WasbHook(wasb_conn_id: str = default_conn_name, public_read: bool = False)
Bases: airflow.hooks.base.BaseHook
Interacts with Azure Blob Storage through the wasb:// protocol.
These parameters have to be passed in the Airflow database: account_name and account_key.
Additional options passed in the 'extra' field of the connection will be passed to the BlockBlobService() constructor. For example, authenticate using a SAS token by adding {"sas_token": "YOUR_TOKEN"}.
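- Parameters
wasb_conn_id (str) – Reference to the wasb connection.
public_read (bool) – Whether anonymous public read access should be used. Default is False.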
-
_get_container_client(self, container_name: str)
Instantiates a container client.
- Parameters
container_name (str) – The name of the container
- Returns
ContainerClient
-
check_for_blob(self, container_name: str, blob_name: str, **kwargs)
Check if a blob exists on Azure Blob Storage.
-
check_for_prefix(self, container_name: str, prefix: str, **kwargs)
Check if a prefix exists on Azure Blob Storage.
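As a rough sketch of how the two existence checks can be combined, the hypothetical helper below classifies a path as an exact blob, a prefix, or missing. It assumes both methods return a truthy value on a match; `hook` is any object exposing the documented methods, in practice a `WasbHook` instance.

```python
def classify_path(hook, container_name: str, path: str) -> str:
    """Return 'blob', 'prefix', or 'missing' for `path` in the container.

    Hypothetical helper; relies only on check_for_blob and
    check_for_prefix as documented above.
    """
    # An exact blob name takes precedence over a prefix match.
    if hook.check_for_blob(container_name, path):
        return "blob"
    if hook.check_for_prefix(container_name, path):
        return "prefix"
    return "missing"
```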
-
get_blobs_list(self, container_name: str, prefix: Optional[str] = None, include: Optional[List[str]] = None, delimiter: Optional[str] = '/', **kwargs)
List blobs in a given container.
- Parameters
container_name (str) – The name of the container
prefix (str) – Filters the results to return only blobs whose names begin with the specified prefix.
include (List[str]) – Specifies one or more additional datasets to include in the response. Options include: snapshots, metadata, uncommittedblobs, copy, deleted.
delimiter (str) – Filters objects based on the delimiter (e.g. '.csv').
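A minimal sketch of listing blobs under a prefix and narrowing the result on the client side. It assumes `get_blobs_list` returns an iterable of blob name strings, which the text above does not state explicitly; the helper name is hypothetical.

```python
from typing import List


def list_blobs_with_suffix(
    hook, container_name: str, prefix: str, suffix: str
) -> List[str]:
    """Return sorted blob names under `prefix` that end with `suffix`.

    Assumption: hook.get_blobs_list yields plain name strings.
    """
    names = hook.get_blobs_list(container_name, prefix=prefix)
    return sorted(name for name in names if name.endswith(suffix))
```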
-
load_file(self, file_path: str, container_name: str, blob_name: str, **kwargs)
Upload a file to Azure Blob Storage.
-
load_string(self, string_data: str, container_name: str, blob_name: str, **kwargs)
Upload a string to Azure Blob Storage.
-
get_file(self, file_path: str, container_name: str, blob_name: str, **kwargs)
Download a file from Azure Blob Storage.
-
read_file(self, container_name: str, blob_name: str, **kwargs)
Read a file from Azure Blob Storage and return as a string.
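The string-based methods differ in argument order: `load_string` takes the data first, while `read_file` starts with the container name. The hypothetical helper below copies a text blob using that pair; it is a sketch that assumes the blob content fits in memory.

```python
def copy_text_blob(
    hook, container_name: str, source_blob: str, target_blob: str
) -> str:
    """Copy a text blob within one container and return its content.

    Hypothetical helper built on read_file and load_string.
    """
    # read_file(container_name, blob_name) returns the content as a string.
    text = hook.read_file(container_name, source_blob)
    # load_string(string_data, container_name, blob_name) uploads it again.
    hook.load_string(text, container_name, target_blob)
    return text
```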
-
upload(self, container_name, blob_name, data, blob_type: str = 'BlockBlob', length: Optional[int] = None, **kwargs)
Creates a new blob from a data source with automatic chunking.
- Parameters
container_name (str) – The name of the container to upload data to
blob_name (str) – The name of the blob to upload. This need not exist in the container
data – The blob data to upload
blob_type (storage.BlobType) – The type of the blob. This can be either BlockBlob, PageBlob or AppendBlob. The default value is BlockBlob.
length (int) – Number of bytes to read from the stream. This is optional, but should be supplied for optimal performance.
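Since `length` is recommended for performance, one possible pattern is to wrap in-memory bytes in a stream and pass the known size along. The helper name `upload_bytes` is hypothetical, and the return value is simply whatever `upload` returns.

```python
import io


def upload_bytes(hook, container_name: str, blob_name: str, payload: bytes):
    """Upload `payload` as a block blob, supplying its length up front.

    Hypothetical helper around the documented upload() signature.
    """
    stream = io.BytesIO(payload)
    return hook.upload(
        container_name,
        blob_name,
        stream,
        blob_type="BlockBlob",
        length=len(payload),
    )
```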
-
download(self, container_name, blob_name, offset: Optional[int] = None, length: Optional[int] = None, **kwargs)
Downloads a blob to the StorageStreamDownloader.
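- Parameters
container_name (str) – The name of the container containing the blob
blob_name (str) – The name of the blob to download
offset (int) – Start of the byte range to download. Must be set if length is provided.
length (int) – Number of bytes to read from the stream.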
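A short sketch of reading a byte range through `download`. It assumes the returned object offers `readall()`, as the Azure SDK's `StorageStreamDownloader` does; the helper name is hypothetical.

```python
def read_blob_range(
    hook, container_name: str, blob_name: str, offset: int, length: int
) -> bytes:
    """Return `length` bytes of a blob starting at `offset`.

    Hypothetical helper; assumes the downloader exposes readall().
    """
    downloader = hook.download(
        container_name, blob_name, offset=offset, length=length
    )
    return downloader.readall()
```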
-
create_container(self, container_name: str)
Create a container object if it does not already exist.
- Parameters
container_name (str) – The name of the container to create
-
delete_container(self, container_name: str)
Delete a container object.
- Parameters
container_name (str) – The name of the container
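One way to pair `create_container` and `delete_container` is a context manager that provides a scratch container and removes it afterwards. This is an illustrative pattern, not part of the hook; `temporary_container` is a hypothetical name.

```python
from contextlib import contextmanager


@contextmanager
def temporary_container(hook, container_name: str):
    """Create a container, yield its name, and delete it on exit.

    Hypothetical helper; the container is deleted even if the body raises.
    """
    hook.create_container(container_name)
    try:
        yield container_name
    finally:
        hook.delete_container(container_name)
```

Usage would look like `with temporary_container(hook, "scratch") as name: ...`, where `hook` is a `WasbHook` instance.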
-
delete_blobs(self, container_name: str, *blobs, **kwargs)
Marks the specified blobs or snapshots for deletion.
-
delete_file(self, container_name: str, blob_name: str, is_prefix: bool = False, ignore_if_missing: bool = False, **kwargs)
Delete a file from Azure Blob Storage.
- Parameters
container_name (str) – Name of the container.
blob_name (str) – Name of the blob.
is_prefix (bool) – If True, treat blob_name as a prefix and delete all matching files.
ignore_if_missing (bool) – If True, return success even if the blob does not exist.
kwargs (object) – Optional keyword arguments that ContainerClient.delete_blobs() takes.
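The two flags can be combined to clear everything under a prefix without failing when nothing matches. The sketch below uses a hypothetical helper name and passes only the documented parameters.

```python
def purge_prefix(hook, container_name: str, prefix: str) -> None:
    """Delete every blob whose name starts with `prefix`.

    Hypothetical helper; a missing prefix is treated as success
    because ignore_if_missing is set.
    """
    hook.delete_file(
        container_name,
        prefix,
        is_prefix=True,
        ignore_if_missing=True,
    )
```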