Are you not using a caching registry mirror, and instead pulling the same image from Hub for each runner? If so, that seems like an easy win to add, unless you specifically do mostly hot/unique pulls.
The more efficient answer to those rate limits is almost always to pull fewer times for the same work, rather than scaling in a way that circumvents them.
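For anyone wondering what that looks like in practice, a minimal sketch (the mirror hostname is a placeholder): each runner just needs "registry-mirrors" set in /etc/docker/daemon.json, e.g. via a provisioning snippet like this, and the mirror side can be a stock registry:2 container started with REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io (the documented pull-through cache mode).

    # Sketch: point each runner's dockerd at an internal pull-through cache
    # so repeated pulls of the same image never reach Docker Hub.
    # "https://mirror.ci.internal" is an assumed/placeholder hostname.
    import json, pathlib

    DAEMON_JSON = pathlib.Path("/etc/docker/daemon.json")

    def add_registry_mirror(mirror="https://mirror.ci.internal"):
        cfg = json.loads(DAEMON_JSON.read_text()) if DAEMON_JSON.exists() else {}
        mirrors = cfg.setdefault("registry-mirrors", [])
        if mirror not in mirrors:
            mirrors.append(mirror)
        DAEMON_JSON.write_text(json.dumps(cfg, indent=2))
        # restart dockerd afterwards (systemctl restart docker) to pick it up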
From a performance / efficiency perspective, we generally recommend using ECR Public images[0], since AWS hosts mirrors of all the "Docker official" images, and throughput to ECR Public is great from inside AWS.
Any pulls that go this route also cost nothing against Docker Hub's limits.
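A rough sketch of what the switch amounts to; the mapping below only covers bare official-library references (anything with its own registry or namespace is left alone), since those are the images AWS mirrors under public.ecr.aws/docker/library/:

    # "alpine:3.20" -> "public.ecr.aws/docker/library/alpine:3.20"
    def to_ecr_public(image: str) -> str:
        # Only rewrite bare official-library references (no registry, no namespace).
        if "/" in image.split(":", 1)[0]:
            return image
        return f"public.ecr.aws/docker/library/{image}"

    assert to_ecr_public("alpine:3.20") == "public.ecr.aws/docker/library/alpine:3.20"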
Any cache you put between Docker Hub and your own infra would probably be S3-backed anyway, so adding another cache in between could be mostly a waste.
I'm slightly old; is that the same thing as a ramdisk? https://en.wikipedia.org/wiki/RAM_drive
The last bit (emphasis added) sounds novel to me; I don't think I've heard of anybody doing that before. It sounds like an almost-"free" way to get a ton of performance ("almost" because somebody has to figure out the sizing, though I bet you could automate that by having your tool export a "desired size" metric equal to the high watermark of tmpfs-like storage used during the CI run).
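A hedged sketch of that automation, assuming the RAM-backed storage is mounted at /mnt/ramdisk and that you can attach a metric or artifact to the job (the mount point and metric name are made up):

    import os, threading

    def watch_high_watermark(mount, stop_event, interval=1.0):
        """Poll a tmpfs-like mount and remember the peak bytes used."""
        peak = 0
        while not stop_event.is_set():
            st = os.statvfs(mount)
            peak = max(peak, (st.f_blocks - st.f_bfree) * st.f_frsize)
            stop_event.wait(interval)
        # Emit the watermark so the tooling can size the next run's ramdisk.
        print(f"ci_ramdisk_desired_size_bytes {peak}")

    stop = threading.Event()
    t = threading.Thread(target=watch_high_watermark, args=("/mnt/ramdisk", stop), daemon=True)
    t.start()
    # ... run the CI job ...
    stop.set(); t.join()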
The Linux page cache exists to speed up access to the durable store, i.e. the underlying block device (NVMe, SSD, HDD, etc.).
The RAM-backed block device in question here is more like tmpfs, but with an ability to use the disk if, and only if, it overflows. There's no intention or need to store its whole contents on the durable "disk" device.
Hence you can do things entirely in RAM as long as your CI/CD job can fit all the data there, but if it can't fit, the job just gets slower instead of failing.
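For a concrete feel of that behaviour, one off-the-shelf approximation (not the block-level device being described, but the same overflow semantics) is a tmpfs sized past physical RAM with swap on the local NVMe as the spill path. Paths and sizes below are made up:

    import subprocess

    def run(*cmd):
        subprocess.run(cmd, check=True)

    def mount_overflowing_ramdisk(mount_point="/mnt/build", size="48g",
                                  swapfile="/var/swapfile-ci", swap_mib=32768):
        # Swap on the local NVMe gives the kernel somewhere to push cold tmpfs pages.
        run("dd", "if=/dev/zero", f"of={swapfile}", "bs=1M", f"count={swap_mib}")
        run("chmod", "600", swapfile)
        run("mkswap", swapfile)
        run("swapon", swapfile)
        # tmpfs itself lives in RAM; pages only spill to swap under memory pressure,
        # so the job slows down instead of failing when it outgrows RAM.
        run("mkdir", "-p", mount_point)
        run("mount", "-t", "tmpfs", "-o", f"size={size}", "tmpfs", mount_point)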
Consider a scenario where your VM has 4GB of RAM and your build touches 16GB worth of files over its lifetime, but its active working set at any moment is only around 2GB. If you preload all of your Docker images at the start of the build, they'll initially sit in the page cache. As the build progresses, though, the kernel will start evicting those cached images to make room for whatever was touched most recently, even files that are read infrequently or only once. And that's the key bit: you want to force caching of the files you know are accessed more than once.
By implementing your own caching layer, you gain explicit control, allowing critical data to remain persistently cached in memory. In contrast, the kernel-managed page cache treats cached pages as opportunistic, evicting the least recently used pages whenever new data must be accommodated, even if this new data isn't frequently accessed.
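A minimal sketch of that kind of explicit control, assuming you know up front which paths get re-read: stage them onto tmpfs (/dev/shm here) so they stay resident no matter what else the build streams through. The path list is hypothetical, and tools like vmtouch can alternatively lock files in place with mlock.

    import pathlib, shutil

    HOT_PATHS = ["/opt/ci/toolchain", "/opt/ci/base-images"]  # hypothetical reused inputs

    def stage_on_tmpfs(paths=HOT_PATHS, dest="/dev/shm/ci-hot"):
        # /dev/shm is a tmpfs; copies here live in RAM (they can still be
        # swapped if swap exists, but they are never just dropped like
        # ordinary page-cache pages).
        dest = pathlib.Path(dest)
        dest.mkdir(parents=True, exist_ok=True)
        for p in map(pathlib.Path, paths):
            target = dest / p.name
            if p.is_dir():
                shutil.copytree(p, target, dirs_exist_ok=True)
            else:
                shutil.copy2(p, target)
        return dest  # point the build at this copy instead of the originals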
I believe various RDBMSs bypass the page cache and use their own strategies for managing caching if you give them access to raw block devices, right?
Every Linux kernel does that already. I currently have 20 GB of disk cached in RAM on this laptop.
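(That figure comes straight from the "Cached" line in /proc/meminfo; a quick way to read it:)

    def page_cache_bytes():
        # "Cached" is the page cache (file data, incl. tmpfs/shmem), reported in kB.
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("Cached:"):
                    return int(line.split()[1]) * 1024
        raise RuntimeError('"Cached:" not found in /proc/meminfo')

    print(f"{page_cache_bytes() / 2**30:.1f} GiB of file data cached in RAM")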
- Configuring a block-level in-memory disk accelerator / cache (fs operations at the speed of RAM!)
- Benchmarking EC2 instance types (m7a is the best x86 today, m8g is the best arm64)
- "Warming" the root EBS volume by accessing a set of priority blocks before the job starts to give the job full disk performance [0]
- Launching each runner instance in a public subnet with a public IP - the runner gets full throughput from AWS to the public internet, and per-IP rate limits (e.g. Docker Hub's) rarely apply
- Configuring Docker with containerd/estargz support
- Just generally turning off kernel options and unit files that aren't needed
[0] https://docs.aws.amazon.com/ebs/latest/userguide/ebs-initial...
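Re the warming point: a hedged sketch of the idea, with a made-up device path and block list. The read itself is the point; it forces EBS to hydrate those blocks from S3 before the job needs them.

    import os

    DEVICE = "/dev/nvme0n1"  # root EBS volume on a Nitro instance (assumption)
    PRIORITY_RANGES = [(0, 64 << 20), (10 << 30, 1 << 30)]  # (offset, length) in bytes, made up

    def warm_blocks(device=DEVICE, ranges=PRIORITY_RANGES, chunk=1 << 20):
        fd = os.open(device, os.O_RDONLY)
        try:
            for offset, length in ranges:
                end = offset + length
                while offset < end:
                    # Data is discarded; reading is enough to pull the block in.
                    os.pread(fd, min(chunk, end - offset), offset)
                    offset += chunk
        finally:
            os.close(fd)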