Over the last four months I have become quite an active user of Azure App Service with Linux containers. In my opinion there are several reasons to choose Linux containers:
- They are cheaper.
- Providing Docker images gives us nice reusability.
- This is indeed serverless as the whole infrastructure is managed by Azure.
- We can run multiple containers within one instance.
- We can choose between well-configured default images or roll our own.
All in all quite simple, right? Except that this is still beta software, and the documentation is not up to the standard set by established Azure services. Furthermore, there are some cruel pitfalls that can make your life hell. Here are three lessons I've learned in the last couple of months.
Let's cut to the chase and list them right away:
- Keep alive is a must.
- No support for the permessage-deflate WS extension.
- Azure metrics / ping services are not reliable.
The first point is quite important: Azure offers "unlimited" incoming HTTP connections, but will cut you off at ~150 outgoing ones. 150?! That is really low! I expected that number to be at least an order of magnitude higher. Normally, though, this problem is easily solved by enabling keep-alive, so that outgoing requests reuse existing connections instead of opening new ones. There are some other HTTP and TCP/IP socket settings that should be applied for best performance, but these are rather cosmetic. The big point is that if you do not want your backend to stall suddenly, you need the keep-alive option.
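In Node.js, for example, keep-alive for outgoing requests can be enabled through a shared agent. The following is only a minimal sketch: the `maxSockets` value of 100 is an assumed cap to stay below the roughly 150 connections mentioned above, and `fetchStatus` is a hypothetical helper name.

```typescript
import * as https from "node:https";

// Shared agent: sockets are kept open and reused across requests.
// maxSockets applies per host; 100 is an assumed value below the ~150 limit.
const keepAliveAgent = new https.Agent({
  keepAlive: true,
  maxSockets: 100,
});

// Hypothetical helper: issues a GET through the shared agent
// and resolves with the HTTP status code.
function fetchStatus(url: string): Promise<number> {
  return new Promise((resolve, reject) => {
    https
      .get(url, { agent: keepAliveAgent }, (res) => {
        res.resume(); // drain the body so the socket can return to the pool
        res.on("end", () => resolve(res.statusCode ?? 0));
      })
      .on("error", reject);
  });
}
```

The important part is that every outgoing request goes through the same agent; creating a new agent per request would defeat the purpose.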
The second point may only apply to some of the pre-configured images. I (surprisingly) never had any problems with WebSockets in custom images. That changed when I used the offered Node.js (8+) image as a base. Even though my WebSocket connection worked flawlessly locally, it would not work when hosted within the Linux app container. Luckily, after researching the topic I came across the respective wiki on GitHub, where an FAQ entry covered exactly this issue. It turns out that one needs to disable the permessage-deflate extension, which can be done quite easily in Node.js.
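As a sketch of what that can look like, assume the server is built on the `ws` package (which library is actually in use is an assumption here). `ws` accepts a `perMessageDeflate` option, and setting it to `false` disables the extension. The `WsServerOptions` interface below is a hypothetical stand-in for the relevant part of the library's options so that the snippet stays self-contained, and the port is an example value.

```typescript
// Hypothetical stand-in for the subset of the `ws` server options used here.
interface WsServerOptions {
  port: number;
  perMessageDeflate: boolean;
}

// port 8080 is an assumed example value;
// perMessageDeflate: false turns off the permessage-deflate extension.
const wsServerOptions: WsServerOptions = {
  port: 8080,
  perMessageDeflate: false,
};

// With `ws` installed, these options would be passed to the server constructor:
//   import { WebSocketServer } from "ws";
//   const server = new WebSocketServer(wsServerOptions);
```

Other WebSocket libraries for Node.js tend to expose a similar switch, so the same idea should carry over; check the options of the library you actually use.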
The last point may be rather generic, but I never had such problems with standard (Windows) app services in the past. The story goes as follows: we set up several alerts that trigger when certain response times are exceeded. The tests are repeated to confirm the initial result before any message is sent. Even though this should be quite robust in theory, we get several messages per day warning us of potential downtime. It is fair to say that an actual downtime never happened.
Running our own ping tests side by side, we see considerable differences. How can it be that a ping test from inside the Azure data center yields worse numbers than one from an outside network? That seems hard to explain to me ...
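The confirm-before-alerting idea described above can be sketched as a small pure function. This is not how Azure evaluates its alerts; `shouldAlert`, the threshold, and the number of confirmations are hypothetical names and values chosen for illustration.

```typescript
// Returns true only if the last `confirmations` samples all exceed the threshold.
function shouldAlert(
  responseTimesMs: number[],
  thresholdMs: number,
  confirmations: number
): boolean {
  if (confirmations <= 0 || responseTimesMs.length < confirmations) {
    return false;
  }
  return responseTimesMs
    .slice(-confirmations)
    .every((t) => t > thresholdMs);
}

// A single slow sample does not alert; three slow samples in a row do.
const oneSpike = shouldAlert([120, 2500, 130], 1000, 3); // false
const sustained = shouldAlert([120, 2500, 2600, 2700], 1000, 3); // true
```

Feeding the samples of two ping sources into the same function makes it easy to see whether they would have raised the same alerts.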
Bottom line: Azure and Linux are still a good fit; there are just the typical beta problems (edge cases, lack of documentation, ...). I will certainly continue on the all-in Linux container app service path; it is fast, reliable, and very flexible.