If you have a PG cluster with 1 writer, 2 readers, 10Ti of storage, and 16k provisioned IOPS (io1/io2 has better latency than gp3), you pay for 30Ti and 48k PIOPS without redundancy, or 60Ti and 96k PIOPS with multi-AZ, because each RDS instance carries its own full copy of the storage.
With the same Aurora setup, you pay for 10Ti once and get multi-AZ for free (assuming the same cluster layout and that you've put the instances in different AZs), since all instances share the cluster volume.
I don't want to work out the exact numbers, but IIRC with enough storage--especially io1/io2--you can end up saving money and getting better performance. For smaller amounts of storage, the numbers don't necessarily work out.
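To make the multiplier concrete, here's a back-of-the-envelope sketch. The per-GiB and per-IOPS prices are placeholders rather than current AWS list prices, and it only covers storage/IOPS, not instances:

```python
STORAGE_TIB = 10
PIOPS = 16_000
INSTANCES = 3  # 1 writer + 2 readers

# Hypothetical unit prices -- check the AWS pricing pages for real numbers.
IO1_PER_GIB_MONTH = 0.125
IO1_PER_PIOPS_MONTH = 0.065
AURORA_PER_GIB_MONTH = 0.10

def rds_storage_cost(multi_az: bool) -> float:
    """Each RDS instance (and its multi-AZ standby) carries a full copy of the volume."""
    copies = INSTANCES * (2 if multi_az else 1)
    gib = STORAGE_TIB * 1024
    return copies * (gib * IO1_PER_GIB_MONTH + PIOPS * IO1_PER_PIOPS_MONTH)

def aurora_storage_cost() -> float:
    """Aurora instances share one cluster volume, so storage is billed once."""
    return STORAGE_TIB * 1024 * AURORA_PER_GIB_MONTH

print(f"RDS io1, single-AZ:    ${rds_storage_cost(False):,.0f}/mo")
print(f"RDS io1, multi-AZ:     ${rds_storage_cost(True):,.0f}/mo")
print(f"Aurora cluster volume: ${aurora_storage_cost():,.0f}/mo (plus per-I/O or I/O-Optimized charges)")
```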
There are also two I/O billing modes to be aware of. The default is pay-per-I/O, which is really only helpful for generally low I/O usage with occasional extreme spikes. The other mode is I/O-Optimized, where you pay a flat ~30% on top of the instance cost for unlimited I/O--if you had an I/O-heavy workload before, you can get a lot more I/O and still end up cheaper in this mode.
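As a rough sketch of that break-even, assuming roughly $0.20 per million I/O requests in Standard mode (check current pricing) and the ~30% surcharge mentioned above; the instance spend is hypothetical and this ignores any storage-price difference between the modes:

```python
INSTANCE_COST_MONTH = 3_000.0     # hypothetical monthly instance spend for the cluster
IO_PRICE_PER_MILLION = 0.20       # assumed Standard-mode price per 1M I/O requests
IO_OPTIMIZED_SURCHARGE = 0.30     # the ~30% figure mentioned above

def standard_io_cost(avg_iops: float, days: int = 30) -> float:
    """Per-I/O charges for a sustained average IOPS over a month."""
    requests = avg_iops * 86_400 * days
    return requests / 1_000_000 * IO_PRICE_PER_MILLION

surcharge = INSTANCE_COST_MONTH * IO_OPTIMIZED_SURCHARGE
break_even_iops = surcharge * 1_000_000 / IO_PRICE_PER_MILLION / (86_400 * 30)

print(f"I/O-Optimized surcharge:    ${surcharge:,.0f}/mo")
print(f"Break-even sustained IOPS:  {break_even_iops:,.0f}")
print(f"Standard mode at that rate: ${standard_io_cost(break_even_iops):,.0f}/mo")
# Above that average I/O rate, the flat surcharge is the cheaper mode.
```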
I'd also say Serverless is almost never worth it. IIRC a provisioned instance was ~17% of the cost of the equivalent Serverless capacity. Serverless only works out if you have something like under 4 hours of heavy usage a day followed by near-total idle. To handle workload spikes with fixed instance sizes, you can add instances fairly quickly and fail over with minimal downtime (barring the bug the article describes, of course) without needing Serverless.
[0] https://dev.to/aws-heroes/100k-write-iops-in-aurora-t3medium...
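To make that Serverless break-even concrete, here's a minimal sketch under the assumption that provisioned capacity costs ~17% of the same capacity on Serverless per hour (the rough figure from memory above, not a quoted price):

```python
PROVISIONED_TO_SERVERLESS_RATIO = 0.17  # provisioned hourly cost / serverless hourly cost (assumed)

def serverless_cheaper(busy_hours_per_day: float, idle_capacity_fraction: float = 0.0) -> bool:
    """Compare a day of always-on provisioned capacity against serverless that runs
    at full capacity for busy_hours and idles at idle_capacity_fraction otherwise."""
    provisioned = 24 * PROVISIONED_TO_SERVERLESS_RATIO
    serverless = busy_hours_per_day + (24 - busy_hours_per_day) * idle_capacity_fraction
    return serverless < provisioned

for hours in (2, 4, 6, 8):
    print(hours, "busy hours/day -> serverless cheaper?", serverless_cheaper(hours))
# Past roughly 4 busy hours/day (24 * 0.17), always-on provisioned wins.
```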
I believe the article is talking about aggregate I/O operations, whereas I'm talking strictly about the average-per-second variety. The former is really only relevant for billing in the standard billing mode.
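For anyone converting between the two, a tiny sketch (the numbers are just illustrative):

```python
def monthly_requests_to_avg_iops(requests: float, days: int = 30) -> float:
    """Aggregate I/O requests billed per month -> sustained average per second."""
    return requests / (days * 86_400)

def avg_iops_to_monthly_requests(iops: float, days: int = 30) -> float:
    """Sustained average per second -> aggregate I/O requests billed per month."""
    return iops * days * 86_400

print(monthly_requests_to_avg_iops(10_000_000_000))  # 10B I/Os/month ~= 3,858 avg IOPS
print(avg_iops_to_monthly_requests(16_000))          # 16k sustained IOPS ~= 41.5B I/Os/month
```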
Actually, a big motivator for the migration was batch writes (we generate tables in Snowflake, export to S3, then import from S3 using the AWS RDS extension), and Aurora's ability to absorb big spikes helped us a lot. We'd see query latency (as reported by APM) increase a decent amount during these bulk imports, and the impact was much smaller with Aurora.
IIRC, with RDS PG some common queries went from 4-5ms normally to 10-12ms during imports, while on Aurora they stayed around 6-7ms during imports (mainly because we were exhausting I/O during imports before).
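For reference, the import step looks roughly like this; the table, bucket, and key names are hypothetical, and it assumes the aws_s3 extension is installed and the cluster has an IAM role with read access to the bucket:

```python
import psycopg2

# Hypothetical connection string -- point it at your own cluster endpoint.
conn = psycopg2.connect("dbname=app host=my-cluster.example.us-east-1.rds.amazonaws.com user=loader")

with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT aws_s3.table_import_from_s3(
            'analytics.daily_rollup',       -- target table (hypothetical)
            '',                             -- empty column list = all columns
            '(FORMAT csv, HEADER true)',    -- COPY options
            aws_commons.create_s3_uri(
                'my-snowflake-exports',     -- bucket (hypothetical)
                'rollups/2024-05-01.csv',   -- key (hypothetical)
                'us-east-1'
            )
        );
        """
    )
```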
Do you have a problem believing these claims on equivalent hardware? https://pages.cs.wisc.edu/~yxy/cs764-f20/papers/aurora-sigmo...
Or run your own performance assessments, following the published documents and templates, so you can find the facts on your own:
For Aurora MySQL:
"Amazon Aurora Performance Assessment Technical Guide" - https://d1.awsstatic.com/product-marketing/Aurora/RDS_Aurora...
For Aurora Postgres:
"...Steps to benchmark the performance of the PostgreSQL-compatible edition of Amazon Aurora using the pgbench and sysbench benchmarking tools..." - https://d1.awsstatic.com/product-marketing/Aurora/RDS_Aurora...
"Automate benchmark tests for Amazon Aurora PostgreSQL" - https://aws.amazon.com/blogs/database/automate-benchmark-tes...
"Benchmarking Amazon Aurora Limitless with pgbench" - https://aws.amazon.com/blogs/database/benchmarking-amazon-au...
We have some clusters with very high write IOPS on Aurora.
When looking at costs, we modelled running MySQL ourselves as well as on regular RDS MySQL.
We found that for the IOPS capacity we get from Aurora, we wouldn't be able to match it on AWS without paying a stupid amount more.
Blatant plug time:
I'm actually working for a company right now ( https://pgdog.dev/ ) that is building proper sharding and failover at the connection-pooler level. We handle failovers like this by pausing write traffic at the pooler (for up to 60 seconds by default) and swapping which backend instance receives traffic.
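As a toy illustration of the pause-and-swap idea (not PgDog's actual implementation; all names here are made up):

```python
import threading

class WriteRouter:
    def __init__(self, primary: str, pause_timeout: float = 60.0):
        self._primary = primary
        self._pause_timeout = pause_timeout
        self._resumed = threading.Event()
        self._resumed.set()              # writes flow by default

    def execute_write(self, query: str) -> str:
        # Block (up to the timeout) while a failover is in progress.
        if not self._resumed.wait(timeout=self._pause_timeout):
            raise TimeoutError("failover did not complete in time")
        return f"sent to {self._primary}: {query}"

    def failover(self, new_primary: str) -> None:
        self._resumed.clear()            # pause: new writes queue up in wait()
        try:
            self._primary = new_primary  # a real pooler waits for promotion, drains in-flight txns, etc.
        finally:
            self._resumed.set()          # resume traffic against the new primary

router = WriteRouter("aurora-instance-1")
router.failover("aurora-instance-2")
print(router.execute_write("INSERT INTO events VALUES (1)"))
```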
Max throughput on gp3 was recently increased to 2 GB/s; is there some way I don't know about to get 3.125?
> General Purpose SSD (gp3) - Throughput
> gp3 supports a max of 4000 MiBps per volume
But the docs say 2000. Then there's IOPS... The calculator allows up to 64,000, but on [0], if you expand "Higher performance and throughput" it says
> Customers looking for higher performance can scale up to 80,000 IOPS and 2,000 MiBps for an additional fee.
I think 80k IOPS on gp3 is a newer release, so presumably AWS hasn't updated RDS from the old max of 64k. IIRC it took a while before gp3 and io2 were even available for RDS after they were released as EBS options.
Edit: Presumably it takes some time to do testing/optimization to make sure their RDS config can achieve the same performance as raw EBS. Sometimes there are also limitations with instance generations/types that affect whether you can hit the maximum advertised throughput.