airflow.providers.microsoft.azure.hooks.wasb
This module contains integration with Azure Blob Storage.
It communicates via the Windows Azure Storage Blob (WASB) protocol. Make sure that an Airflow connection of type wasb exists. Authorization can be done by supplying a login (= storage account name) and password (= key), or a login and a SAS token in the extra field (see the connection wasb_default for an example).
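The two authorization variants described above can be sketched as plain data. This is only an illustration of which connection field carries which value: the helper name `wasb_connection_fields` is hypothetical, and the dictionary keys (`conn_type`, `login`, `password`, `extra`) are assumed to mirror the fields of an Airflow connection rather than being taken from this page.

```python
import json


def wasb_connection_fields(account_name, account_key=None, sas_token=None):
    """Return the fields of a wasb connection as a plain dict (hypothetical helper)."""
    # The login always carries the storage account name.
    fields = {"conn_type": "wasb", "login": account_name}
    if account_key is not None:
        # Variant 1: the account key goes into the password field.
        fields["password"] = account_key
    if sas_token is not None:
        # Variant 2: the SAS token goes into the JSON "extra" field.
        fields["extra"] = json.dumps({"sas_token": sas_token})
    return fields


key_based = wasb_connection_fields("mystorageaccount", account_key="YOUR_KEY")
sas_based = wasb_connection_fields("mystorageaccount", sas_token="YOUR_TOKEN")
```

How these fields are stored (UI, CLI, environment variable, or secrets backend) depends on the Airflow deployment and is not covered here.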
Module Contents
Classes
WasbHook: Interacts with Azure Blob Storage through the wasb:// protocol.
- class airflow.providers.microsoft.azure.hooks.wasb.WasbHook(wasb_conn_id: str = default_conn_name, public_read: bool = False)
Bases: airflow.hooks.base.BaseHook
Interacts with Azure Blob Storage through the wasb:// protocol.
These parameters have to be stored in the Airflow database: account_name and account_key.
Additional options passed in the 'extra' field of the connection will be passed to the BlockBlobService() constructor. For example, authenticate using a SAS token by adding {"sas_token": "YOUR_TOKEN"}.
If no authentication configuration is provided, managed identity will be used (applicable when using Azure compute infrastructure).
- Parameters
wasb_conn_id (str) -- Reference to the wasb connection.
public_read (bool) -- Whether anonymous public read access should be used. Default is False.
- static get_connection_form_widgets() -> Dict[str, Any]
Returns connection widgets to add to the connection form.
- check_for_blob(self, container_name: str, blob_name: str, **kwargs) -> bool
Check if a blob exists on Azure Blob Storage.
- check_for_prefix(self, container_name: str, prefix: str, **kwargs)
Check if a prefix exists on Azure Blob Storage.
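As a minimal sketch of how the two existence checks might be combined, the helper below reports which expected blobs and prefixes are absent. `missing_inputs` is a hypothetical name, and the hook is passed in as an argument so the logic only assumes the `check_for_blob` and `check_for_prefix` signatures listed above.

```python
def missing_inputs(hook, container_name, blob_names, prefixes=()):
    """Return the blob names and prefixes that the hook reports as absent."""
    # Exact blob names are tested with check_for_blob.
    missing = [
        name for name in blob_names
        if not hook.check_for_blob(container_name, name)
    ]
    # Prefixes ("folders") are tested with check_for_prefix.
    missing += [
        prefix for prefix in prefixes
        if not hook.check_for_prefix(container_name, prefix)
    ]
    return missing


# Assumed usage inside an Airflow task:
# from airflow.providers.microsoft.azure.hooks.wasb import WasbHook
# hook = WasbHook(wasb_conn_id="wasb_default")
# print(missing_inputs(hook, "mycontainer", ["in/a.csv"], ["in/"]))
```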
- get_blobs_list(self, container_name: str, prefix: Optional[str] = None, include: Optional[List[str]] = None, delimiter: Optional[str] = '/', **kwargs) -> List
List blobs in a given container.
- Parameters
container_name (str) -- The name of the container
prefix (str) -- Filters the results to return only blobs whose names begin with the specified prefix.
include (List[str]) -- Specifies one or more additional datasets to include in the response. Options include: snapshots, metadata, uncommittedblobs, copy, deleted.
delimiter (str) -- Filters objects based on the delimiter (e.g. '.csv').
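A short sketch of listing blobs under a prefix follows. It assumes that the returned list contains blob names as strings, which this page does not state explicitly, and it filters by suffix on the client side instead of relying on `delimiter`. The helper name `list_blobs_with_suffix` is hypothetical.

```python
def list_blobs_with_suffix(hook, container_name, prefix, suffix=".csv"):
    """List blob names under `prefix` that end with `suffix`, sorted."""
    # Assumption: get_blobs_list returns an iterable of blob names (str).
    names = hook.get_blobs_list(container_name=container_name, prefix=prefix)
    return sorted(name for name in names if name.endswith(suffix))


# Assumed usage:
# hook = WasbHook(wasb_conn_id="wasb_default")
# csv_blobs = list_blobs_with_suffix(hook, "mycontainer", "raw/")
```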
- load_file(self, file_path: str, container_name: str, blob_name: str, **kwargs) -> None
Upload a file to Azure Blob Storage.
- load_string(self, string_data: str, container_name: str, blob_name: str, **kwargs) -> None
Upload a string to Azure Blob Storage.
- get_file(self, file_path: str, container_name: str, blob_name: str, **kwargs)
Download a file from Azure Blob Storage.
- read_file(self, container_name: str, blob_name: str, **kwargs)
Read a file from Azure Blob Storage and return as a string.
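The string-based methods above can be combined into a simple read, transform, write step. The sketch below is illustrative only: `copy_text_blob` and its `transform` parameter are hypothetical, and the hook is passed in so that only the `read_file` and `load_string` signatures shown above are assumed.

```python
def copy_text_blob(hook, container_name, source_blob, target_blob, transform=str.upper):
    """Read a text blob, apply `transform`, and upload the result as a new blob."""
    # read_file returns the blob content as a string.
    text = hook.read_file(container_name=container_name, blob_name=source_blob)
    # load_string uploads a string under the given blob name.
    hook.load_string(
        string_data=transform(text),
        container_name=container_name,
        blob_name=target_blob,
    )
    return target_blob


# Assumed usage:
# hook = WasbHook(wasb_conn_id="wasb_default")
# copy_text_blob(hook, "mycontainer", "in/report.txt", "out/report.txt")
```

For local files, `load_file` and `get_file` take a `file_path` as their first argument in place of the string.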
- upload(self, container_name, blob_name, data, blob_type: str = 'BlockBlob', length: Optional[int] = None, **kwargs) -> Dict[str, Any]
Creates a new blob from a data source with automatic chunking.
- Parameters
container_name (str) -- The name of the container to upload data to
blob_name (str) -- The name of the blob to upload. The blob need not already exist in the container
data -- The blob data to upload
blob_type (storage.BlobType) -- The type of the blob. This can be either BlockBlob, PageBlob or AppendBlob. The default value is BlockBlob.
length (int) -- Number of bytes to read from the stream. This is optional, but should be supplied for optimal performance.
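As a sketch of the `upload` parameters in use, the helper below wraps an in-memory payload in a stream and passes its size as `length`, following the recommendation above. `upload_bytes` is a hypothetical name, and the content of the returned dictionary is not specified on this page.

```python
import io


def upload_bytes(hook, container_name, blob_name, payload):
    """Upload an in-memory bytes payload as a block blob."""
    stream = io.BytesIO(payload)
    return hook.upload(
        container_name=container_name,
        blob_name=blob_name,
        data=stream,
        blob_type="BlockBlob",  # the documented default, stated explicitly
        length=len(payload),    # supplying the size is recommended for performance
    )


# Assumed usage:
# hook = WasbHook(wasb_conn_id="wasb_default")
# result = upload_bytes(hook, "mycontainer", "data/payload.bin", b"\x00\x01\x02")
```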
- download(self, container_name, blob_name, offset: Optional[int] = None, length: Optional[int] = None, **kwargs) -> azure.storage.blob.StorageStreamDownloader
Downloads a blob to the StorageStreamDownloader.
- Parameters
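A minimal sketch of a ranged download, based on the signature above: it requests only the first bytes of a blob and materializes them with `readall()`, a method of `azure.storage.blob.StorageStreamDownloader`. The helper name `read_blob_head` and the interpretation of `offset` as the starting byte position are assumptions.

```python
def read_blob_head(hook, container_name, blob_name, num_bytes=1024):
    """Return up to the first `num_bytes` bytes of a blob."""
    downloader = hook.download(
        container_name=container_name,
        blob_name=blob_name,
        offset=0,          # assumed: start of the byte range
        length=num_bytes,  # number of bytes to read
    )
    # readall() loads the requested range into memory as bytes.
    return downloader.readall()


# Assumed usage:
# hook = WasbHook(wasb_conn_id="wasb_default")
# header = read_blob_head(hook, "mycontainer", "data/large.bin", num_bytes=16)
```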
- create_container(self, container_name: str) -> azure.storage.blob.ContainerClient
Create a container object if it does not already exist.
- Parameters
container_name (str) -- The name of the container to create
- delete_container(self, container_name: str) -> None
Delete a container object.
- Parameters
container_name (str) -- The name of the container
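The create and delete calls pair naturally for scratch space. The context manager below is a sketch under that assumption: `temporary_container` is a hypothetical helper, and it deletes the container on exit even if the body raises.

```python
from contextlib import contextmanager


@contextmanager
def temporary_container(hook, container_name):
    """Create a container for the duration of a `with` block, then delete it."""
    hook.create_container(container_name)
    try:
        yield container_name
    finally:
        # Runs on normal exit and on exceptions alike.
        hook.delete_container(container_name)


# Assumed usage:
# hook = WasbHook(wasb_conn_id="wasb_default")
# with temporary_container(hook, "scratch") as name:
#     hook.load_string("tmp", name, "probe.txt")
```

Because the container is removed unconditionally, this pattern only suits containers that hold nothing worth keeping.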
- delete_blobs(self, container_name: str, *blobs, **kwargs) -> None
Marks the specified blobs or snapshots for deletion.
- delete_file(self, container_name: str, blob_name: str, is_prefix: bool = False, ignore_if_missing: bool = False, delimiter: str = '', **kwargs) -> None
Delete a file from Azure Blob Storage.
- Parameters
container_name (str) -- Name of the container.
blob_name (str) -- Name of the blob.
is_prefix (bool) -- If blob_name is a prefix, delete all matching files.
ignore_if_missing (bool) -- If True, then return success even if the blob does not exist.
kwargs (object) -- Optional keyword arguments that ContainerClient.delete_blobs() takes.
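The `is_prefix` and `ignore_if_missing` flags can be combined for an idempotent cleanup step. The sketch below assumes only the `delete_file` signature documented above; `purge_prefixes` is a hypothetical helper name.

```python
def purge_prefixes(hook, container_name, prefixes):
    """Delete every blob under each prefix, tolerating prefixes that match nothing."""
    for prefix in prefixes:
        hook.delete_file(
            container_name=container_name,
            blob_name=prefix,
            is_prefix=True,          # treat blob_name as a prefix
            ignore_if_missing=True,  # do not fail when nothing matches
        )


# Assumed usage:
# hook = WasbHook(wasb_conn_id="wasb_default")
# purge_prefixes(hook, "mycontainer", ["tmp/", "staging/2021-01-01/"])
```

With `ignore_if_missing=True`, rerunning the cleanup after a partial failure should not raise for prefixes that were already emptied.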