When planning a deployment or migration to any Azure storage type, there are a number of design factors to take into account. This post lists Azure Storage design considerations and can serve as a design guide for Azure architects and pre-sales engineers who design Azure storage solutions.
Azure storage account types
The first step when choosing storage for your solution is to review your infrastructure and application requirements and select the appropriate storage types. Azure offers the following storage types as of late February 2023.
- Blob (using containers)
- Azure files (SMB file shares)
For more details on the available relational and non-relational data types in Azure, refer to the Azure DP-900 certification exam curriculum at: https://stefanos.cloud/blog/microsoft-dp-900-certification-exam-study-guide/.
When provisioning a new Azure storage account, the following general options are available for performance. This setting cannot be changed after storage account creation.
- Standard (general purpose V2)
- Premium (for low latency scenarios)
The following security options can be set for any Azure storage account.
- Require secure transfer for REST API operations
- Allow enabling public access on containers. Blob containers, by default, do not permit public access to their content. This setting allows authorized users to selectively enable public access on specific containers. You can use Azure policy to audit this setting or prevent this setting from being enabled.
- Enable storage account key access
- Default to Azure Active Directory authorization in the Azure portal
- Minimum TLS version
- Permitted scope for copy operations
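The security options above map to properties on the storage account resource, so they can be checked programmatically. The following is a minimal sketch that compares a dictionary of account properties against a baseline. The property names follow the Azure Resource Manager storage account schema; the baseline values and the `audit_security` helper are illustrative assumptions, not an official recommendation.

```python
# Baseline expressed with ARM storage account property names.
# The chosen values are an illustrative assumption, not an official baseline.
BASELINE = {
    "supportsHttpsTrafficOnly": True,      # require secure transfer
    "allowBlobPublicAccess": False,        # no public access on containers
    "allowSharedKeyAccess": False,         # disable storage account key access
    "defaultToOAuthAuthentication": True,  # default to Azure AD in the portal
    "minimumTlsVersion": "TLS1_2",         # minimum TLS version
    "allowedCopyScope": "AAD",             # permitted scope for copy operations
}


def audit_security(properties: dict) -> list[str]:
    """Return one finding per property that deviates from BASELINE."""
    findings = []
    for name, expected in BASELINE.items():
        actual = properties.get(name)
        if actual != expected:
            findings.append(f"{name}: expected {expected!r}, found {actual!r}")
    return findings


# Hypothetical account properties used only for demonstration.
example = {
    "supportsHttpsTrafficOnly": True,
    "allowBlobPublicAccess": True,
    "minimumTlsVersion": "TLS1_0",
}
for finding in audit_security(example):
    print(finding)
```

In practice the same intent is usually enforced with Azure Policy, as noted above for public container access; a script like this is only useful for quick reviews of exported account settings.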
The following access tiers are available and only applicable to blob data:
- Hot (online)
- Cool (online)
- Archive (offline). The archive access tier is not an available option during storage account resource creation. It is an offline tier for storing data that is rarely accessed and has the lowest storage cost. However, its data retrieval costs and retrieval latency are higher than those of the hot and cool tiers.
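Moving blobs between these tiers is commonly automated with a lifecycle management policy. The sketch below builds such a policy as JSON, following the lifecycle management rule schema. The rule name, the `logs/` prefix and the day thresholds are hypothetical values chosen for illustration.

```python
import json


def build_lifecycle_policy(prefix: str, cool_after: int,
                           archive_after: int, delete_after: int) -> dict:
    """Build a lifecycle policy that tiers block blobs down and then deletes them."""
    if not (0 <= cool_after < archive_after < delete_after):
        raise ValueError("expected cool_after < archive_after < delete_after")
    return {
        "rules": [{
            "enabled": True,
            "name": "tier-then-delete",  # hypothetical rule name
            "type": "Lifecycle",
            "definition": {
                "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                "actions": {"baseBlob": {
                    "tierToCool": {"daysAfterModificationGreaterThan": cool_after},
                    "tierToArchive": {"daysAfterModificationGreaterThan": archive_after},
                    "delete": {"daysAfterModificationGreaterThan": delete_after},
                }},
            },
        }]
    }


policy = build_lifecycle_policy("logs/", cool_after=30, archive_after=180, delete_after=730)
print(json.dumps(policy, indent=2))
```

The resulting JSON can be pasted into the code view of the lifecycle management blade or supplied to deployment tooling.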
Region and zone placement
In some cases, a storage account can be provisioned at an Azure Edge zone.
More information about Azure public multi-access edge compute (MEC) can be found at https://azure.microsoft.com/en-us/solutions/public-multi-access-edge-compute-mec/#overview.
Redundancy determines how the data in each storage account is replicated within a datacenter, across availability zones, or to another Azure region to achieve durability and high availability. The following Azure storage redundancy options are available.
- LRS. Locally-redundant storage. Suitable for non-critical scenarios.
- GRS. Geo-redundant storage. Recommended for backup scenarios.
- ZRS. Zone-redundant storage. Recommended for high availability scenarios.
- GZRS. Geo-zone redundant storage. Includes the offerings and benefits of both GRS and ZRS.
- RA-GRS. This option is a variation of the GRS option with the addition of the “Make read access to data available in the event of regional unavailability” option.
- RA-GZRS. This option is a variation of the GZRS option with the addition of the “Make read access to data available in the event of regional unavailability” option.
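The choice among the six options above comes down to three questions: whether zone redundancy is needed, whether geo-redundancy is needed, and whether the secondary region must be readable. A minimal sketch of that decision follows. The `pick_redundancy` helper is hypothetical; the SKU names are the standard-performance SKU identifiers used by Azure tooling.

```python
# Standard-performance SKU names for each redundancy option.
SKU_NAMES = {
    "LRS": "Standard_LRS",
    "ZRS": "Standard_ZRS",
    "GRS": "Standard_GRS",
    "GZRS": "Standard_GZRS",
    "RA-GRS": "Standard_RAGRS",
    "RA-GZRS": "Standard_RAGZRS",
}


def pick_redundancy(zone_redundant: bool, geo_redundant: bool,
                    read_access_secondary: bool = False) -> str:
    """Map availability requirements to one of the redundancy options."""
    if read_access_secondary and not geo_redundant:
        raise ValueError("read access to a secondary region requires geo-redundancy")
    if geo_redundant:
        base = "GZRS" if zone_redundant else "GRS"
        return "RA-" + base if read_access_secondary else base
    return "ZRS" if zone_redundant else "LRS"


option = pick_redundancy(zone_redundant=True, geo_redundant=True,
                         read_access_secondary=True)
print(option, SKU_NAMES[option])
```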
Depending on your chosen redundancy option, different options will be available under the Data Management > Redundancy blade of the storage account in the Azure portal, as shown in the example below.
Network connectivity and network routing options
The following network connectivity and network routing options are available for Azure storage accounts.
- Network connectivity. You can connect to your storage account either publicly, via public IP addresses or service endpoints, or privately, using a private endpoint.
- Network routing. Determine how to route your traffic as it travels from the source to its Azure endpoint. Microsoft network routing is recommended for most customers.
The following data protection options can be configured for an Azure storage account.
Point-in-time restore and hierarchical namespace cannot be enabled simultaneously. Likewise, versioning and hierarchical namespace cannot be enabled simultaneously. When point-in-time restore is enabled, versioning, blob change feed and blob soft delete are also enabled. Where a retention period applies to these features, it must be greater than the retention period of point-in-time restore.
The following encryption options can be used for any Azure storage account.
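The retention rule for point-in-time restore can be expressed as a simple check. This is a minimal sketch with a hypothetical helper name and parameters; `change_feed_days=None` is used here to model a change feed with no retention limit.

```python
from typing import Optional


def check_restore_retention(restore_days: int, soft_delete_days: int,
                            change_feed_days: Optional[int] = None) -> list[str]:
    """Return the retention settings that are too short for point-in-time restore."""
    problems = []
    if soft_delete_days <= restore_days:
        problems.append("blob soft delete retention must exceed the restore period")
    # None models a change feed that keeps its logs indefinitely.
    if change_feed_days is not None and change_feed_days <= restore_days:
        problems.append("change feed retention must exceed the restore period")
    return problems


print(check_restore_retention(restore_days=7, soft_delete_days=14))
print(check_restore_retention(restore_days=7, soft_delete_days=7, change_feed_days=5))
```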
Other Azure storage configuration options
The following additional options can be set during Azure storage account creation. There are dependencies between various options, in that some options cannot be enabled if a combination of other options is selected.
- Enable hierarchical namespace. The Data Lake Storage Gen2 hierarchical namespace accelerates big data analytics workloads and enables file-level access control lists (ACLs).
- Enable SFTP. Requires hierarchical namespace.
- Enable network file system v3. Enables the Network File System Protocol for your storage account that allows users to share files across a network. This option must be set during storage account creation.
- Allow cross-tenant replication. Allows object replication to copy blobs to a destination account in a different Azure Active Directory (Azure AD) tenant. If cross-tenant replication is not enabled, object replication is limited to accounts within the same Azure AD tenant. Cross-tenant replication and hierarchical namespace cannot be enabled simultaneously.
- Enable large file shares. Provides file share support up to a maximum of 100 TiB. Storage accounts with large file shares cannot be converted to geo-redundant storage offerings, and the upgrade is permanent. This option cannot be changed after storage account creation.
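The dependencies between these options can be checked before a deployment is attempted. The sketch below encodes only the rules stated in the list above; the `StorageOptions` class and `find_conflicts` function are hypothetical names, and the Azure platform may enforce further restrictions that are not modelled here.

```python
from dataclasses import dataclass

GEO_REDUNDANT = {"GRS", "GZRS", "RA-GRS", "RA-GZRS"}


@dataclass
class StorageOptions:
    hierarchical_namespace: bool = False
    sftp: bool = False
    cross_tenant_replication: bool = False
    large_file_shares: bool = False
    redundancy: str = "LRS"


def find_conflicts(options: StorageOptions) -> list[str]:
    """Return the option combinations that violate the rules listed above."""
    conflicts = []
    if options.sftp and not options.hierarchical_namespace:
        conflicts.append("SFTP requires hierarchical namespace")
    if options.cross_tenant_replication and options.hierarchical_namespace:
        conflicts.append("cross-tenant replication cannot be combined "
                         "with hierarchical namespace")
    if options.large_file_shares and options.redundancy in GEO_REDUNDANT:
        conflicts.append("large file shares cannot use geo-redundant storage")
    return conflicts


requested = StorageOptions(hierarchical_namespace=True, sftp=True,
                           cross_tenant_replication=True,
                           large_file_shares=True, redundancy="GRS")
for conflict in find_conflicts(requested):
    print(conflict)
```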
External access options
The “Networking” blade in the Azure portal provides the following options for connecting to an Azure storage account.
Public network access
This setting determines whether public access is enabled and, if so, from which networks and IP addresses. If it is disabled, the storage account is only reachable via private endpoint connections (maximum security).
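When public access is restricted to selected IP addresses, the effect of the firewall rules can be reasoned about as follows. This is a simplified sketch: the dictionary layout mirrors the `networkAcls` section of the storage account resource, the `is_ip_allowed` helper is hypothetical, and virtual network rules, resource instances and private endpoints are deliberately ignored.

```python
import ipaddress


def is_ip_allowed(client_ip: str, network_acls: dict) -> bool:
    """Evaluate only the IP rules of a storage account firewall configuration."""
    if network_acls.get("defaultAction", "Allow") == "Allow":
        return True  # public access from all networks
    address = ipaddress.ip_address(client_ip)
    for rule in network_acls.get("ipRules", []):
        network = ipaddress.ip_network(rule["value"], strict=False)
        if rule.get("action", "Allow") == "Allow" and address in network:
            return True
    return False


# Documentation-range addresses used purely as an example.
acls = {
    "defaultAction": "Deny",
    "ipRules": [
        {"value": "203.0.113.0/24", "action": "Allow"},
        {"value": "198.51.100.7", "action": "Allow"},
    ],
}
print(is_ip_allowed("203.0.113.42", acls))
print(is_ip_allowed("192.0.2.1", acls))
```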
Resource instances and network routing
This setting determines which Azure resource types have access to the storage account and how traffic is routed from the external endpoint to the Azure storage account.
The “Access Keys”, “Shared Access Signature (SAS)” and “Lifecycle management” blades in the Azure portal dictate how external access will be provisioned and how Azure storage account data will be preserved or disposed of, as per predefined policy metrics.
Access keys and corresponding connection strings
A shared access signature (SAS) is a URI that grants restricted access rights to Azure Storage resources. You can provide a shared access signature to clients who should not be trusted with your storage account key but whom you wish to delegate access to certain storage account resources. By distributing a shared access signature URI to these clients, you grant them access to a resource for a specified period of time. An account-level SAS can delegate access to multiple storage services (i.e. blob, file, queue, table). Note that stored access policies are currently not supported for an account-level SAS.
Shared Access Signature (SAS) options include connection string, SAS token and SAS URL for each of the available storage types (blob, file, table, queue).
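Before distributing an account-level SAS token, it is worth checking what it actually grants. The sketch below parses the token's query parameters (`ss` for services, `sp` for permissions, `se` for expiry, `spr` for allowed protocols). The sample token is fabricated with a placeholder signature, the `describe_account_sas` helper is hypothetical, and the expiry is assumed to use the full `YYYY-MM-DDTHH:MM:SSZ` form.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs

SERVICES = {"b": "blob", "f": "file", "q": "queue", "t": "table"}


def describe_account_sas(token: str, now: datetime) -> dict:
    """Summarise the scope and expiry of an account-level SAS token."""
    params = {key: values[0] for key, values in parse_qs(token.lstrip("?")).items()}
    # Assumption: expiry is given in the full UTC timestamp form.
    expiry = datetime.strptime(params["se"], "%Y-%m-%dT%H:%M:%SZ")
    expiry = expiry.replace(tzinfo=timezone.utc)
    return {
        "services": [SERVICES[code] for code in params.get("ss", "")],
        "permissions": params.get("sp", ""),
        "https_only": params.get("spr") == "https",
        "expired": now >= expiry,
    }


# Fabricated token with a placeholder signature, for illustration only.
sample = ("?sv=2022-11-02&ss=bf&srt=sco&sp=rl"
          "&se=2023-03-01T00:00:00Z&spr=https&sig=PLACEHOLDER")
print(describe_account_sas(sample, now=datetime(2023, 2, 28, tzinfo=timezone.utc)))
```

A short expiry, HTTPS-only access and the narrowest set of services and permissions keep the delegated access as limited as the scenario allows.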
Data migration options
There are a variety of Azure data transfer solutions available for customers. In the Azure portal under the “Data Migration” blade, select the resource type and transfer scenario, based on which the Azure portal wizard will guide you to the solution that best fits your scenario. Please note that the data transfer rate you observe is impacted by the size and number of files in the transfer, as well as your infrastructure performance and network utilization by other applications. An example of the data migration wizard is shown below.
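To get a first feel for whether an online transfer is realistic, the transfer time can be estimated from the data size and the available bandwidth. This is a rough sketch: the `utilization` factor is an assumed fraction of the link that the transfer can actually use, and the effect of many small files, noted above, is not modelled.

```python
def estimate_transfer_hours(size_gib: float, bandwidth_mbps: float,
                            utilization: float = 0.8) -> float:
    """Estimate the hours needed to move size_gib over a link of bandwidth_mbps."""
    if bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("bandwidth must be positive and utilization in (0, 1]")
    size_bits = size_gib * 1024 ** 3 * 8            # GiB -> bits
    effective_bps = bandwidth_mbps * 1_000_000 * utilization
    return size_bits / effective_bps / 3600         # seconds -> hours


# Example: 1 TiB over a 1 Gbps link with 80% of the bandwidth available.
hours = estimate_transfer_hours(size_gib=1024, bandwidth_mbps=1000)
print(f"{hours:.2f} hours")
```

If the estimate runs into weeks rather than hours, an offline transfer option is usually worth evaluating instead.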
The most notable tools for migrating data to Azure storage accounts are the following: