bayindirh
If "make -j" successfully drowns a machine, I can argue that the machine has no serious bottlenecks for the job. Because, make is generally I/O bound when run with high parallelism, and if you can't saturate your I/O bandwidth, that's a good thing in general.

However, if "make -j" is saturates a machine, and this is unintentional, I'd assume PEBKAC, or "holding it wrong", in general.


davemp
The problem is ‘make -j’ spinning up hundreds of C++ compilation jobs, using up all of the system’s RAM+swap, and causing major instability.

I get that the OS could mitigate this, but that’s often not an option in professional settings. The reality is that most of the time users expect ‘make -j’ to behave like ‘make -j $(N_PROC)’, get bit in the ass, and then the GNU maintainers say PEBKAC, wasting hundreds of hours of junior dev time.
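
(For reference, and not something from the thread itself: the usual way to avoid the unbounded default is to pass an explicit job count derived from the CPU count, along the lines of the sketch below.)

    # cap parallelism at the number of online CPUs (GNU coreutils)
    make -j"$(nproc)"

    # a common alternative where nproc is unavailable
    make -j"$(getconf _NPROCESSORS_ONLN)"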

dspillett
> The problem is ‘make -j’ spinning up 100s of C++ compilation jobs, using up all of the systems RAM+swap, and causing major instability.

I would put that in the “using it improperly” category. I never use⁰ --jobs without specifying a limit.

Perhaps there should have been a much more cautious default instead of the default being ∞, maybe something like four¹, or even just 2, and if people wanted infinite they could just specify something big enough to encompass all the tasks that could possibly run in the current process. Or perhaps --load-average should have defaulted to something like min(2, CPUs×2) when --jobs was in effect⁴.
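
(A rough sketch of what that kind of more cautious invocation looks like from the command line; the numbers are illustrative, not a recommendation from this thread.)

    # explicit job cap plus a load-average ceiling: make stops starting
    # new jobs while the load average is at or above the given value
    make --jobs=4 --load-average=8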

The biggest bottleneck when using --jobs back then wasn't RAM or CPU though, it was random IO on traditional high-latency drives. A couple of parallel jobs could make much better use of even a single single-core CPU, because the CPU-crunching of a CPU-busy task or two would overlap with the IO of the other tasks, but too many concurrent tasks would result in an IO flood that could practically stall the affected drives for a time, putting the CPU back into a state of waiting ages for IO (probably longer than it would without multiple jobs running). This would throttle a machine² before it ran out of RAM, even with the small RAM we had back then compared to today. With modern IO and core counts, I can imagine RAM being the bigger issue now.

--------

[0] Well, *used*; I've not touched make for quite some time.

[1] Back when I last used make much at all, small USB sticks and SD cards were not uncommon, but SSDs big, quick, and hardy enough for system or work drives were an expensive dream. With frisbee-based drives I found a four-job limit was often a good compromise: approaching, but not hitting, significantly diminishing returns if you had sufficient otherwise unused RAM, while keeping a near-zero chance of effectively stalling the machine with a flood of random IO.

[2] Or every machine… I remember some fool³ bogging down the shared file server of most of the department with a vast parallel job, ignoring the standing request to run large jobs on local filesystems where possible anyway.

[3] Not me, I learned the lesson by DoSing my home PC!

[4] Though in the case of causing an IO storm on a remote filesystem, a load-average limit might be much less effective.

davemp
Thanks for the historical perspective. It probably was less of an issue on older hardware because you could still ctrl-c if you were IO starved. Linux userspace does not do well when the OOM killer comes out to play.

Personally, I don’t think these footguns need to exist.

dspillett
Though in the shared drive example, only the host causing the problem can ctrl+c its way out of it. Running something on the file server to work out the culprit (by checking the owner of the files being accessed, for instance) will be pretty much blocked behind everything else affected by the IO storm.

bayindirh OP
I’ll kindly disagree on wasting junior developer time. Anyone using tools professionally should read (or at least skim) the manual of said tool, especially if it’s something foundational to their whole workflow.

They are junior because they are inexperienced, but being junior is the best place to make mistakes and learn good habits.

If somebody asks what is the most important thing I have learnt over the years, I’d say “read the manual and the logs”.

davemp
There’s a difference between understanding your tool and unnecessary cognitive load.

Make does not provide a sane way to run in parallel. You shouldn’t have to compose a command that parses /proc/cpuinfo to get the desired behavior of “fully utilize my system please”. This is not a detail that is particularly relevant to conditional compilation/dependency trees.
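
(The sort of composition being complained about looks roughly like the sketch below; neither form is provided by make itself, and the first is the manual /proc/cpuinfo parse.)

    # hand-rolled CPU count from /proc/cpuinfo ...
    make -j"$(grep -c ^processor /proc/cpuinfo)"

    # ... versus the coreutils shorthand for the same thing
    make -j"$(nproc)"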

This feels like it’s straight out of the Unix Haters Handbook.

[0]: https://web.mit.edu/~simsong/www/ugh.pdf (see p. 186)

duped
It's trivial to go OOM on a modern dev machine with -j$(nproc) these days because of parallel link jobs. Make is never the bottleneck; it's just the trigger.
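
(A hedged mitigation sketch, assuming each link job needs on the order of 2 GB, which is a made-up figure to tune per project: size the job count from available memory instead of core count.)

    # MemAvailable is reported in kB; allow ~2 GB per job (assumed figure)
    jobs=$(awk '/^MemAvailable:/ {print int($2 / (2 * 1024 * 1024))}' /proc/meminfo)
    make -j"$(( jobs > 0 ? jobs : 1 ))"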
