Running an Azure migration can involve many services and parameters that should be taken into account. In an Azure migration Proof of Concept (PoC), the Azure PoC subscription serves as an intermediary platform on which to perform an initial migration, test the proof of concept, validate functionality and then switch over to production.
As with any migration, an Azure migration is never a straightforward or linear process. It is custom, complex work, so planning is of the utmost importance. This article lists some high-level Azure migration design considerations, which can generally be split into two major categories:
- Functional design, covering the operational aspects, Azure resources and features needed to meet your infrastructure and application operational requirements.
- Non-functional design, covering all non-functional aspects of your Azure or hybrid cloud infrastructure, including billing, performance, high availability and resiliency, security/privacy, compliance, governance, and business continuity and disaster recovery (BCDR).
Before starting your design, check whether you or your customer is eligible for the Microsoft FastTrack for Azure program.
The following steps should be considered when carrying out an Azure migration:
- Perform a business and technical assessment of the current production environment. This is a complex auditing process and should be adapted to cover all services, apps and data in scope for the migration. An Azure migration assessment template can be used to capture the required assessment parameters (for example on-premises and cloud infrastructure, authentication, datastores, security and performance). The outcome of the assessment is a mapping of current services to Azure services. While performing the initial technical assessment of your source environment, consider using Azure pre-migration assessment tools.
- Design the Azure target environment for the migration. This includes the design of the Azure tenant and Azure AD licensing/features, management groups, subscriptions, resource groups, tags and policies. One good practice for large Azure tenants is to create three subscriptions: one for dedicated resources (actual production applications, IaaS, PaaS, SaaS), one for shared resources (e.g. management, compliance, governance, connectivity, identity, storage, networking, etc.) and one for test/sandbox. In large Azure environments, consider using Azure landing zones, Azure Blueprints and Azure Policy; Azure Policies are usually deployed as part of Azure Blueprints and Azure landing zones. Among other things, this helps ensure that your target environment is built according to Microsoft best practices and aligns with the Azure Well-Architected Framework (WAF) and the Azure Cloud Adoption Framework (CAF). The Azure design should cover in detail all functional and non-functional aspects of the systems involved, including Business Continuity and Disaster Recovery (BCDR), performance levels and the security configuration of each Azure service involved. Make additional design considerations for each service based on its specific requirements (for example Azure App Service, Azure Database for MySQL, etc.). Always take into account the CAF best practices for Azure Policy, as described at https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/govern/guides/complex/ and https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/decision-guides/policy-enforcement/.
- To achieve infrastructure redundancy (a.k.a. resiliency and high availability) and scalability, employ the appropriate method for each Azure resource. Some examples include fault domains, update domains, VM availability sets and VM scale sets. For each resource you will need to decide between local (datacenter), zone and region redundancy.
- Consider utilizing the Azure Architecture Center and the Azure Enterprise-Scale designs offered by Microsoft. The Enterprise-Scale architecture provides prescriptive guidance coupled with Azure best practices, and it follows design principles across the critical design areas for organizations to define their Azure architecture. It will continue to evolve alongside the Azure platform and is ultimately defined by the design decisions that organizations must make on their Azure journey. The Enterprise-Scale architecture is modular by design: organizations can start with foundational landing zones that support their application portfolios, begin as small as needed and scale alongside their business requirements.
- Consider naming conventions for your Azure resources. Guidance for applying a cohesive naming and tagging strategy for Azure resources can be found in the CAF at https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/azure-best-practices/naming-and-tagging. Naming conventions can be applied using native Azure tools and methodologies as well as other IaC tools, such as Terraform, which has an Azure naming convention provider.
- Consider a proper tag structure for your Azure tenant. Common tags to consider applying to all your Azure resources include Environment, CostCenter, SLA, Criticality, ApplicationOwner and Region/Location.
- Consider all network bandwidth that will be consumed in the target Azure tenant. This includes inbound (ingress) and outbound (egress) traffic from/to the Internet and external WAN links, as well as inter-VNet traffic, both within the same region and between different regions. Also assess network performance by running a PoC and testing metrics such as RTT and latency on all your network paths. Last but not least, always employ the CAF network decision tree when deciding upon your network architecture in Azure.
- Consider your storage lifecycle policy and overall data management and lifecycle policy.
- Consider your user roles and always assign roles to admins and users through RBAC, applying the principle of least privilege. When feasible, utilize advanced features of the Azure AD Premium P1 and P2 plans, such as conditional access, access reviews and Privileged Identity Management (PIM), as well as just-in-time (JIT) VM access, which is a Microsoft Defender for Cloud feature.
- Migrate a representative subset of resources from production to an intermediary Azure PoC environment. You can request an Azure PoC environment from Microsoft via your Tier-1 Microsoft CSP distributor or through other channels made available by Microsoft. If your request is approved, you will receive an Azure Pass offer with a PoC subscription to use for your preliminary migration tests. Otherwise you will have to work with a free Azure trial subscription for that purpose. When the PoC environment is fully configured, create functional tests, BCDR tests and performance/stress tests, with systems engineers and software engineers coordinating to prepare and execute the test scripts.
- Configure all security-related parameters in a lock-down fashion and re-run all functional tests to verify that your security configuration does not break any functionality. Also run security/penetration tests and security evaluation tests using services such as SSLLabs and Securityheaders. Microsoft Defender for Cloud (formerly Azure Security Center) is also a good place to start: use it to understand and analyze the Secure Score and the security posture of your new Azure tenant.
- When all functional tests are completed in the PoC, move your Azure PoC subscription resources to your final Azure production subscription (for example an Azure Enterprise Agreement or Azure CSP subscription). Alternatively, you can re-deploy all PoC resources from scratch in a new target Azure subscription by using ARM templates or Azure Bicep. Take into account that there are certain limitations when moving resources from one subscription to another. Functional and non-functional testing should be carried out again in the final target subscription.
- Plan for a cut-over migration of your apps and data from the current production environment to the target Azure environment. Prepare an end-user/customer communication plan, schedule the required downtime and define a fail-back plan in case something does not work as expected. A common fail-back plan is to revert the customer-facing DNS FQDN to point to the old production infrastructure. For this reason it is important to set the TTL of all involved DNS records to the lowest possible value ahead of the cut-over, so that a DNS change takes effect quickly.
- After the migration is completed, make extensive use of Azure Advisor, which provides a list of recommended tasks for optimizing the cost, operations, security and performance of your Azure tenant.
- Perform cost optimization tasks to minimize costs in your target Azure infrastructure.
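The naming and tagging considerations in the list above can be checked before deployment with a small script. The following Python sketch is only an illustration, not an official Azure or CAF tool. It assumes a hypothetical naming pattern of the form `<type>-<workload>-<environment>-<region>-<instance>`, and it treats the tags suggested in this article as mandatory. Both are assumptions that you should adapt to your own convention. The resource inventory is a plain list of dictionaries, and the helper names (`is_valid_name`, `missing_tags`, `audit`) are made up for this example.

```python
import re

# Tags suggested in this article. Treating all of them as mandatory,
# and using "Region" for the Region/Location tag, is an assumption.
REQUIRED_TAGS = (
    "Environment", "CostCenter", "SLA",
    "Criticality", "ApplicationOwner", "Region",
)

# Hypothetical convention: <type>-<workload>-<environment>-<region>-<instance>
# for example "vm-webshop-prod-westeurope-001".
NAME_PATTERN = re.compile(
    r"^[a-z]{2,6}-[a-z0-9]+-(prod|test|dev)-[a-z0-9]+-\d{3}$"
)


def is_valid_name(name):
    """Return True if the resource name matches the assumed convention."""
    return NAME_PATTERN.match(name) is not None


def missing_tags(tags):
    """Return the required tags that are absent or empty, in a fixed order."""
    return [tag for tag in REQUIRED_TAGS if not tags.get(tag)]


def audit(resources):
    """Map each non-compliant resource name to a list of its problems."""
    findings = {}
    for resource in resources:
        problems = []
        if not is_valid_name(resource["name"]):
            problems.append("name does not match convention")
        for tag in missing_tags(resource.get("tags", {})):
            problems.append(f"missing tag: {tag}")
        if problems:
            findings[resource["name"]] = problems
    return findings


if __name__ == "__main__":
    # Illustrative inventory; in practice this could come from an export
    # of your planned or deployed resources.
    inventory = [
        {"name": "vm-webshop-prod-westeurope-001",
         "tags": {tag: "example" for tag in REQUIRED_TAGS}},
        {"name": "MyVM1", "tags": {"Environment": "prod"}},
    ]
    for name, problems in audit(inventory).items():
        print(name, "->", "; ".join(problems))
```

A check like this is useful in the PoC phase, where it can flag non-compliant resources before they are moved or re-deployed to the production subscription. In the target environment itself, enforcement is better left to Azure Policy, as discussed above.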