Preferences

Definitely a better high level metric than nvidia-smi, and probably fine if you just want to get a very coarse idea of whether or not your are using the GPUs reasonably at all.

But when you get to the point where you care about a few percentage points of utilisation it's just not reliable enough as many things can impact energy consumption both ways. E.g. had a case were the GPU cluster we were using wasn't being cooled well enough, so you would gradually see power draw getting lower and lower as the GPUs were throttling themselves to not overheat.

You can also find cases were energy consumption is high but MFU/HFU isn't, like memory intensive workloads


This item has no comments currently.

Keyboard Shortcuts

Story Lists

j
Next story
k
Previous story
Shift+j
Last story
Shift+k
First story
o Enter
Go to story URL
c
Go to comments
u
Go to author

Navigation

Shift+t
Go to top stories
Shift+n
Go to new stories
Shift+b
Go to best stories
Shift+a
Go to Ask HN
Shift+s
Go to Show HN

Miscellaneous

?
Show this modal