How to troubleshoot Azure Application Gateway issues

Case #

You are using Azure Application Gateway in your solution and need to troubleshoot errors that occur while running your workload. Application Gateway acts as an Application Delivery Controller (ADC) for many different workloads, including Azure App Service, Azure Storage and Azure virtual machines. A typical example is an App Service backend: when you access the App Service through the Application Gateway, you may encounter a wide range of HTTP(S) errors.

The following article explains the available HTTP(S) status and error codes.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Status

This article provides high-level guidance on how to perform troubleshooting of Azure Application Gateway issues.

Solution #

Issue replication #

As with any technical issue, you first need to be able to replicate the issue and understand the pattern in which it manifests (which software modules, which users and which circumstances trigger it). Taking the time to replicate the issue under different circumstances lets you narrow down its scope, based on the specifics of your environment.
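The replication step can be partly scripted. The following is a minimal Python sketch, not an official tool: it sends repeated GET requests to a URL and tallies the returned status codes, so that intermittent errors (for example, occasional 502 responses) show up as a pattern. The function names `fetch_status` and `replicate` and the example URL are hypothetical and chosen only for illustration.

```python
import time
import urllib.error
import urllib.request
from collections import Counter
from typing import Callable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request.

    Redirects are followed, so the final status is reported.
    0 means that no HTTP response was received at all
    (DNS failure, connection refused, timeout and similar).
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; keep their status code.
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def replicate(
    url: str,
    attempts: int = 20,
    delay: float = 0.5,
    fetch: Callable[[str], int] = fetch_status,
) -> Counter:
    """Call `fetch` repeatedly and count how often each status code occurs."""
    counts: Counter = Counter()
    for i in range(attempts):
        counts[fetch(url)] += 1
        if i < attempts - 1:
            time.sleep(delay)  # pause between attempts
    return counts


# Example usage (hypothetical URL, replace with your gateway frontend):
# print(replicate("https://gateway.example.com/health", attempts=50))
```

Running the same script from different networks, or against different paths, is one way to vary the circumstances and narrow down the scope.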

Configuration checks #

First, double-check your Application Gateway configuration to rule out any misconfiguration, and validate your design against Microsoft best practices. If you integrate with Azure App Service (a very common use case for Application Gateway), start with the Microsoft documentation for that integration.

Typical misconfiguration issues include:

  • HttpSetting Request Timeout Is Less Than Probe Timeout
  • Timeout values in the Application Gateway Backend Settings and/or the backend health probe are lower than the timeout values of your application. You can increase the Request Timeout value to suit your backend / app response time. Note that the Public IP also has an idle timeout (4 minutes minimum) that you may need to change to match your request timeout.
  • PickHostNameFromBackendAddress Not Enabled For Azure Web App
  • Authentication Certificate Uploaded For Azure Web App
  • One or more backend resources are not reported as healthy. Verify that the backend health of all your backend resources is OK.
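The checks in the list above can be expressed as a small script. This is only a sketch under stated assumptions: the `GatewaySettings` fields are hypothetical names that you would fill in by hand from the portal or from your own tooling, and they do not mirror the Azure resource schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GatewaySettings:
    """Hand-filled snapshot of the settings discussed above (hypothetical layout)."""
    request_timeout_s: int               # Backend Settings request timeout
    probe_timeout_s: int                 # backend health probe timeout
    app_timeout_s: int                   # timeout of your own application
    public_ip_idle_timeout_s: int = 240  # 4 minutes
    backend_is_app_service: bool = False
    pick_host_name_from_backend_address: bool = False
    authentication_certificates: list[str] = field(default_factory=list)
    backend_health: dict[str, str] = field(default_factory=dict)


def find_misconfigurations(s: GatewaySettings) -> list[str]:
    """Return one human-readable finding per check that fails."""
    findings = []
    if s.request_timeout_s < s.probe_timeout_s:
        findings.append("Request timeout is less than probe timeout")
    if s.request_timeout_s < s.app_timeout_s:
        findings.append("Request timeout is lower than the application timeout")
    if s.request_timeout_s > s.public_ip_idle_timeout_s:
        findings.append("Request timeout exceeds the Public IP idle timeout")
    if s.backend_is_app_service and not s.pick_host_name_from_backend_address:
        findings.append("PickHostNameFromBackendAddress not enabled for Azure Web App")
    if s.backend_is_app_service and s.authentication_certificates:
        findings.append("Authentication certificate uploaded for Azure Web App")
    for server, state in s.backend_health.items():
        if state.lower() != "healthy":
            findings.append(f"Backend {server} is reported as {state}")
    return findings
```

For example, a snapshot with a 20-second request timeout, a 30-second probe timeout and a 60-second application timeout would return the first two findings.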

One of the most common Application Gateway HTTP(S) errors is HTTP 502 (Bad Gateway). This is a very generic error that can have many different root causes. You can find more troubleshooting guidance in the article below:

Troubleshoot Bad Gateway errors - Azure Application Gateway | Microsoft Learn

Finally, ensure that your Azure subscription and resources have not reached their limits and quotas, as described at: https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/azure-subscription-service-limits#application-gateway-limits.

Tracing procedures #

If your design includes integration with Azure App Service and Web Apps, replicate the issue while a Fiddler capture is running. Ensure that Fiddler can decrypt the HTTPS traffic, as described in the following articles: Decrypt HTTPS traffic | Fiddler Classic (telerik.com) and https://docs.microsoft.com/en-us/azure/application-gateway/how-to-troubleshoot-application-gateway-session-affinity-issues#use-web-debugger-to-capture-and-analyze-the-http-or-https-traffics.

Other tracing tools that you can use with Azure Application Gateway are the following.

  • Native tools provided in your App Gateway resource under the "Diagnose and solve problems" section. The following troubleshooting checks are available:
    • 502 errors
    • End to end SSL or SSL offload
    • Configuration update failure
    • Configuration and Setup
    • Connectivity
    • Web Application Firewall (WAF)
    • Ingress controller for AKS - Add-on
    • Ingress controller for AKS - GitHub
    • Performance and scaling
    • Routing
    • Monitoring and logging
  • The native Azure Network Watcher "Connection troubleshoot" tool provided under the monitoring section.
  • The native "Insights" tool provided under the monitoring section.

Bear in mind that it is not possible to take network captures on the Application Gateway itself. Instead, the tools to use for troubleshooting are Fiddler (to check the HTTP headers and behavior) and the metrics provided in the Azure portal.
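Once a capture exists, the failing requests can be pulled out programmatically. The sketch below assumes that the Fiddler sessions were exported in the HTTP Archive (HAR) format and relies only on the standard HAR fields `log.entries[].request.url`, `response.status` and `time`. The function names and the file name in the usage comment are hypothetical.

```python
import json
from collections import Counter


def load_har(path: str) -> dict:
    """Load a HAR file; 'utf-8-sig' tolerates a leading byte order mark."""
    with open(path, encoding="utf-8-sig") as fh:
        return json.load(fh)


def summarize_har(har: dict, min_status: int = 400):
    """Return (status code counts, list of failing requests).

    Each failing request is a tuple (status, url, total time in ms)
    for every entry whose status is at least `min_status`.
    """
    entries = har.get("log", {}).get("entries", [])
    counts: Counter = Counter()
    failures = []
    for entry in entries:
        status = entry.get("response", {}).get("status", 0)
        counts[status] += 1
        if status >= min_status:
            url = entry.get("request", {}).get("url", "")
            failures.append((status, url, entry.get("time", 0)))
    return counts, failures


# Example usage (hypothetical file name):
# counts, failures = summarize_har(load_har("capture.har"))
# for status, url, elapsed_ms in failures:
#     print(status, url, elapsed_ms)
```

Comparing the elapsed time of the failing requests with the configured request timeout can hint at whether an error is timeout-related.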

Further useful Application Gateway troubleshooting documentation is provided below.

  1. Documentation about Fiddler: https://docs.microsoft.com/en-us/azure/azure-web-pubsub/howto-troubleshoot-network-trace#collect-a-network-trace-with-fiddler
  2. Documentation about how to troubleshoot 502 error: Troubleshoot Bad Gateway errors - Azure Application Gateway | Microsoft Learn
  3. Documentation about how to troubleshoot backend health: Troubleshoot backend health issues in Azure Application Gateway | Microsoft Learn
  4. Documentation on how to use Metrics to monitor your Application Gateway:

Last but not least, to measure the Application Gateway performance, consult the following articles:
