Comment by dinvlad - Hacker Neue

dinvlad 2 days ago parent

And now, in the cloud it’s the same, but much more expensive and with worse performance. We’ve been struggling for over a month to get a single (1) non-beefy, non-GPU VM allocated on Azure, since they’ve been having insane capacity issues, to the extent that even “provisioned” capacity cannot be fulfilled ;-(

lazide 2 days ago

Sure, but that’s because Azure. I’m sorry someone made the decision to go there. On AWS & GCP, stockouts at least used to be nearly unheard of.

dijit 2 days ago

Until you hit a certain scale.

I totally agree about Azure being the worst of the three; they wanted us to commit to a certain level of use before even buying the hardware themselves. Crazy…

But I also had capacity issues with Google at large scales in many zones.

lazide 1 day ago

What sort of scale, if you don’t mind me asking?

dijit 1 day ago

Hey, sure! That’s important context.

One gameserver was 40 vCPUs and 256G of RAM; we had about 30-50 before we’d see some issues in some regions (this is from memory now, unfortunately).

Sao Paulo and Tokyo being the worst, but Singapore, Australia and Mumbai also had issues at various times.

The other place where we hit hard limits was Los Angeles, but we had more than a hundred instances then.

The issue with the hard limits is that it’ll be one zone that’s exceeded and the API will fail, so you have to retry with another zone in the same region, but you don’t get to practice building your autoscaler before you actually need it.
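
A minimal sketch of that retry-in-another-zone logic, in Python. Everything here is hypothetical and for illustration only: `ZoneExhausted`, `create_with_zone_fallback`, `fake_create`, and the zone names are made up, and a real autoscaler would call the provider's own API and catch its specific capacity error instead.

```python
from typing import Callable, Iterable, Optional


class ZoneExhausted(Exception):
    """Hypothetical error: the zone has no capacity left for this request."""


def create_with_zone_fallback(
    zones: Iterable[str],
    create_instance: Callable[[str], str],
) -> Optional[str]:
    """Try each zone in order and return the first instance ID created.

    Returns None if every zone reports exhausted capacity.
    """
    for zone in zones:
        try:
            return create_instance(zone)
        except ZoneExhausted:
            continue  # this zone is full, try the next one in the region
    return None


# Stand-in for a real cloud API call (assumption): only one zone has capacity.
def fake_create(zone: str) -> str:
    if zone != "example-region-c":
        raise ZoneExhausted(zone)
    return f"instance-in-{zone}"


result = create_with_zone_fallback(
    ["example-region-a", "example-region-b", "example-region-c"],
    fake_create,
)
print(result)
```

Passing the create call in as a function means the fallback path can be exercised with a fake like `fake_create`, which is one way to practice before the real limits are hit.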

lazide 1 day ago

Oh interesting. Yeah, and I’m guessing you don’t get much prior visibility into available stock, since that would also expose info to their competitors.

That is a lot lower than I expected, but I also imagine that’s a sizable order that they like getting.

solatic 1 day ago

If you're a top X customer running in a smaller region of AWS or GCP, yes, you need to do capacity planning with your TAM. You get to a point where quota increase requests are not auto-approved.
