App Service Network Troubleshooting
My basic process for debugging any sort of networking issue is to reduce the complexity of the issue and walk up the network stack. Oftentimes we overlook a basic problem, such as a DNS resolution issue or an incorrect reference in code, and that can cost hours, days, or weeks going down a rabbit hole. This post specifically references Windows App Services, but the same concepts apply to Linux; you’ll just need to install the packages for those tools if they are not already on the image.
Another great blog from a colleague on the same topic:
Step 1 – Do you actually understand what the issue is?
What’s the error you’re getting? Before getting started, understand the exact exception being thrown by the application. Is it a TCP connection timeout, a query or request timeout (that’s different from a TCP connection timeout), a DNS resolution error, or access denied when calling a port? The problem could manifest as a different error, so do a simple Bing/Google search to validate what the error actually means. Each of these situations warrants a different method of debugging, so make sure you don’t overlook the basic step of understanding, “What error am I getting?” For example, something related to “hostname not found” is not a network connectivity problem; that’s an issue with DNS resolution.
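If the app happens to be written in Python (an assumption on my part, since nothing above is tied to a language), the triage described here could be sketched roughly as follows. The function name and the category labels are hypothetical and only meant to show how different exception types point at different layers.

```python
import socket
import ssl


def classify_network_error(exc):
    """Map an exception to the layer that most likely failed (rough guess)."""
    # Every type checked here is a subclass of OSError, so the checks go
    # from most specific to least specific.
    if isinstance(exc, socket.gaierror):
        # "hostname not found" style failures: a DNS problem, not connectivity.
        return "dns-resolution"
    if isinstance(exc, ssl.SSLError):
        return "tls-handshake"
    if isinstance(exc, ConnectionRefusedError):
        return "tcp-refused"
    if isinstance(exc, (TimeoutError, socket.timeout)):
        # Python raises the same type for connect and read timeouts; the
        # traceback shows whether it came from connect() or a later read.
        return "timeout"
    if isinstance(exc, PermissionError):
        return "access-denied"
    return "unknown"
```

Something like classify_network_error(err) inside an except block gives a first hint about which of the steps below to start with, although the full traceback is still the real source of truth.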
Step 2 – Kudu is your best friend
If you don’t already know about Kudu (not Kudo as I hear some people call it 😀 ) and you use app services, you are missing out on likely the most powerful tool you have to start debugging. It provides the ability to view the filesystem, take profiles and memory dumps, view process information, and use the CMD or PowerShell console associated with the underlying VM. To access Kudu, navigate to your web app -> Advanced Tools -> Go. The other method is inserting .scm after the app name in the URL bar, for example appname.scm.azurewebsites.net. There’s a lot to cover for everything you can do in Kudu, so I’ll leave that for another time.
Step 3 – Network related commands
I wrote a blog a couple of years ago covering common network tools, so for ease of use (and because I’m lazy) I’m copying and pasting two of them here and adding other commands I’ve used a lot since the original blog was written.
- Tcpping.exe : This command is similar to ping or psping: it tests whether a web app can reach an endpoint on a given port via a hostname or IP address. If a web app cannot reach an endpoint via hostname, it’s always a good idea to test the IP address that corresponds to the hostname in case there is an issue with the DNS lookup. Tcpping defaults to port 80 unless another port is specified, i.e. “<hostname or IP address>:port”. For more information about the command and additional switches, type tcpping in the console. Note that the -t and -n switches are best used in Kudu. Examples:
tcpping google.com:443 -t
- Nameresolver.exe : This command is similar to nslookup: it does a DNS lookup against the DNS server that is configured for the web app. By default, a standard app service uses Azure DNS. If the app service is configured with VNET integration (this includes both ASE types as well), it will use the custom DNS servers configured for the VNET. To run the lookup against a different DNS server, add the IP address of that server after the hostname, separated by a space, i.e. “hostname <DNS Server IP>”. Examples:
nameresolver google.com 8.8.8.8
- SET WEBSITE_DNS_ : This command outputs the DNS servers currently configured for the web app. If the error “Environment variable WEBSITE_DNS_ not defined” is returned, no custom DNS servers are configured for the web app.
- curl.exe : I cannot emphasize enough how powerful this tool is. I’ve used it to debug everything from TLS/SSL issues and cipher issues to FTP, SMTP, etc. Since it’s not a command unique to Azure I won’t go into much detail, but a simple search on “curl” and the protocol you want to use should yield results.
Curl is a command line tool that allows you to make requests using different protocols.
A couple of things I’ll mention: always tack on the --verbose flag, as that will show more debug information about the IP, ciphers, HTTP version, and TLS negotiation. You can also tack on --tlsv1.x (for example --tlsv1.2) to choose a TLS version when troubleshooting potential TLS related issues. In recent curl builds that flag sets the minimum version, so pair it with --tls-max if you need to pin exactly one version.
- sql cmd – I’m far from a SQL expert but I’ve used this tool to troubleshoot some basic
Step 4 – My debugging process
Technically you can skip some of these steps, but just for sanity I’d recommend going through them all to avoid missing something obvious. My general practice is to walk up the network stack the same way an app would, starting with DNS, then TCP connectivity, and finally the actual protocol (HTTP, FTP, SMTP, etc.) you’re trying to connect with.
- DNS – What is the hostname or custom domain you are trying to access? If you are simply using an IP you can skip this step. First validate that the hostname resolves to the expected IP. Under Kudu Console -> Debugging Console -> CMD, use the nameresolver command to validate that the hostname returns the correct IP from the expected DNS server. If it doesn’t, try nameresolver domain IPofDNSServer as described earlier, in case for some reason the app is not targeting the expected DNS server. Lastly, if it’s still failing, check whether other domains that should be working are resolving. I cannot tell you how many times the crux of the problem is found after understanding what is happening during DNS resolution.
If this is failing:
– Confirm what DNS server the app is using; if you aren’t sure, use nameresolver domain IPofDNSServer to target a specific server.
– Try step #2 below and tcpping IPofDNSServer:53 to validate the app has connectivity to the DNS server.
– If that is failing, try targeting a well known DNS server like 8.8.8.8, which is Google’s DNS server, to see if something like www.microsoft.com is resolving.
– If this is failing, check the network path and the DNS servers. If you have a secondary, try that DNS server as well. I’ve seen multiple cases where a customer’s DNS server is unexpectedly unresponsive, causing the issue.
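For cases where you want the same DNS check from inside code rather than from the Kudu console, here is a minimal sketch in Python (my choice of language, not something the tools above require). The function names are hypothetical. Note that Python’s standard library only talks to the system resolver, so unlike nameresolver it cannot target a specific DNS server without a third-party package.

```python
import os
import socket


def configured_dns_servers(env=None):
    """Return WEBSITE_DNS_* settings, mirroring `SET WEBSITE_DNS_` in Kudu."""
    env = os.environ if env is None else env
    return {k: v for k, v in env.items() if k.upper().startswith("WEBSITE_DNS_")}


def resolve(hostname):
    """Resolve a hostname with the system resolver.

    Returns a sorted list of unique IP addresses, or an empty list when
    resolution fails.
    """
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    return sorted({info[4][0] for info in infos})
```

An empty dictionary from configured_dns_servers() corresponds to the “not defined” case described earlier, and an empty list from resolve() means the failure is at the DNS layer, before any TCP connection is attempted.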
- TCP connectivity – After we know we are using the right IP address, we need to test TCP connectivity to that endpoint. Tests like an ICMP ping don’t really tell us if TCP connectivity is working on a specific port, so we need to use tcpping, which is similar to psping and allows us to target a specific IP and port combination. If TCP connectivity is not working, trying to make an HTTP, SQL, etc. call isn’t going to work either. I’d highly recommend trying to tcpping both the custom domain and the specific IP just in case you have an issue with DNS, for example tcpping <hostname>:443 followed by tcpping <IP address>:443.
If this is failing, here are a couple of possible reasons:
– You’re trying to connect to a private IP and the app is not running in an App Service Environment (ASE) or, if it is in the multi-tenant space, it does not have VNET integration. You cannot talk to internal-only IPs from an app service without one of those options.
– The service you are connecting to is blocking the IP of the web app. See this doc to understand how outbound IPs work with app services.
– If VNET integration is enabled, check routes, NSGs, or appliances in the path to that endpoint that could be blocking connectivity. Another useful test is to check whether you can connect to the service from a VM in the same VNET with the same UDRs and routes (you won’t be able to use the same subnet). Debugging an issue on a VM is a little less cumbersome as you have a lot more control.
– Check any known limitations for VNET integration, for example at the time of this blog (9/15/21) port 25 is not supported with regional VNET integration. This likely will change in the future.
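As a rough stand-in for tcpping when it isn’t available (for example on a VM used for comparison), a single TCP handshake test can be sketched in Python. This is only an approximation of what tcpping does, and the function name and return shape are hypothetical.

```python
import socket
import time


def tcp_check(host, port, timeout=3.0):
    """Attempt one TCP handshake; return (ok, elapsed_ms, error_text)."""
    start = time.perf_counter()
    try:
        # create_connection resolves the host (if it is a name) and
        # completes the TCP handshake; the socket is closed right away.
        with socket.create_connection((host, port), timeout=timeout):
            ok, error = True, ""
    except OSError as exc:  # covers refused, timed out, and DNS failures
        ok, error = False, f"{type(exc).__name__}: {exc}"
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return ok, elapsed_ms, error
```

Calling it once with the hostname and once with the IP address mirrors the advice above: if only the hostname call fails, the problem is DNS rather than TCP connectivity.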
- Isolate app vs something else
The cool thing about Kudu is that you can essentially test connectivity completely outside the context of your application code. This is a helpful step as you may not own the code, there could be a bug or misconfiguration, or the app may just not be behaving as expected. The next step, if possible, is to test the protocol you’re trying to connect with using curl or sqlcmd. As mentioned above, curl supports a wide variety of protocols and I’ve used it to debug FTP, SMTP, and most of the time HTTP related issues.
Ways this is helpful:
– You can specify TLS versions to identify whether your app is failing to connect because it uses an older TLS version such as TLS 1.0 or 1.1, for example by specifying --tlsv1.0 in the curl request
– You can run an operation like downloading a file from something like blob storage with curl to confirm whether slow download speeds are related to your code or to the platform
– You could build an HTTP PUT, POST, PATCH, or DELETE with a specific body to fully understand how authentication is working without potential ambiguity in your code. You can also see response headers that could show whether some appliance or proxy in the middle is actually the service returning the 403 or 503.
– Test SMTP connectivity as described in my blog here: https://blog.brooksjc.com/2018/09/18/troubleshooting-smtp-issues-sending-emails-from-azure-web-apps/
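To reproduce the TLS version test from code instead of curl, a minimal Python sketch might look like the following. It pins the minimum and maximum version to the same value, which is roughly what pairing --tlsv1.2 with --tls-max 1.2 does in curl. The function names are hypothetical, and very old versions such as TLS 1.0 may be rejected by the local OpenSSL build before any connection is made.

```python
import socket
import ssl


def pinned_tls_context(version):
    """Build a client context that only negotiates the given TLS version."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def try_tls(host, port=443, version=ssl.TLSVersion.TLSv1_2, timeout=5.0):
    """Return the negotiated protocol name, or the error text on failure."""
    ctx = pinned_tls_context(version)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version()
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return f"failed: {type(exc).__name__}: {exc}"
```

If try_tls succeeds with TLS 1.2 but the app itself fails against the same endpoint, that points toward the app’s TLS configuration rather than the network path.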
- Finally, if/once everything is working from your testing in Kudu, with the proper DNS resolution, TCP connectivity, and protocol connections using the right IP and port combination, and it’s still not working, it’s likely an application issue. From there, try to simplify the app so you know for sure the specific endpoint, port, and execution path that is occurring.