I totally agree about Azure being the worst of the three: they wanted us to commit to a certain level of usage before they would even buy the hardware themselves. Crazy…
But I also had capacity issues with Google at large scales in many zones.
Each gameserver was 40 vCPUs and 256 GB of RAM, and we could run about 30-50 of them before we’d start seeing issues in some regions (this is from memory now, unfortunately).
Sao Paulo and Tokyo were the worst, but Singapore, Australia and Mumbai also had issues at various times.
The other place where we hit hard limits was Los Angeles, but we had more than a hundred instances there by then.
The issue with the hard limits is that a single zone gets exhausted and the API call fails, so you have to retry in another zone in the same region, but you don’t get to practice building your autoscaler around that before you actually need it.
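
For anyone building this, here is a minimal sketch of that zone fallback. It is not any provider's real API: `create_instance` and `ZoneExhaustedError` are hypothetical stand-ins for your cloud client call and its out-of-capacity error, and the zone names are only illustrative. The fake at the bottom is one way to exercise the failure path before it happens in production.

```python
import random
from typing import Callable, Sequence


class ZoneExhaustedError(Exception):
    """Hypothetical error meaning a zone has no capacity left."""


def create_with_zone_fallback(
    zones: Sequence[str],
    create_instance: Callable[[str], str],
) -> str:
    """Try each zone of one region in random order; return the first success."""
    candidates = list(zones)
    random.shuffle(candidates)  # avoid always hitting the same zone first
    failures = []
    for zone in candidates:
        try:
            return create_instance(zone)
        except ZoneExhaustedError as exc:
            # This zone is full: remember why, then move on to the next one.
            failures.append(f"{zone}: {exc}")
    # Every zone in the region refused the request.
    raise ZoneExhaustedError("all zones exhausted: " + "; ".join(failures))


def fake_create(zone: str) -> str:
    """Fake provider call for dry runs: only one zone has capacity."""
    if zone != "southamerica-east1-c":
        raise ZoneExhaustedError("no capacity")
    return f"gameserver-in-{zone}"
```

Calling `create_with_zone_fallback(["southamerica-east1-a", "southamerica-east1-b", "southamerica-east1-c"], fake_create)` lands on the one zone that still has room, whatever order the zones are tried in. If every zone is full it raises, and that is the point where you'd want to fall back to another region or alert someone.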