The real benefit is not using 6x the network bandwidth, storage, memory, processing power, and battery of the mobile device. That benefit is not going anywhere, no matter what.
Post-processing is applied to a signal that is already physically impossible to distinguish from the source. It is true that processing often needs higher resolution, and DSPs will upsample internally, operate on floats, and then convert back. But to claim, without evidence, that post-processing might give a human listener back the ability to tell whether a 192/24 medium was used instead of 48/16 would be to reintroduce the same quality-loss paranoia, just with an extra step. If one couldn't hear the difference before an effect was applied... they won't hear it after.
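Concretely, that internal chain looks something like this. A minimal numpy/scipy sketch; the 4x oversampling factor, the `effect` callback, and the TPDF output dither are placeholder assumptions, not any particular DSP's implementation:

```python
import numpy as np
from scipy.signal import resample_poly

def apply_effect_48_16(pcm16, effect, rng=np.random.default_rng(0)):
    """Sketch of a typical DSP chain for a 48/16 source: promote to float64,
    oversample, run the effect, come back down, dither back to 16 bits.
    The extra resolution lives in the intermediate format, not in the medium."""
    x = pcm16.astype(np.float64) / 32768.0      # 16-bit PCM -> float in [-1, 1)
    x_hi = resample_poly(x, 4, 1)               # internal upsample (48 kHz -> 192 kHz equivalent)
    y_hi = effect(x_hi)                         # e.g. EQ, saturation, limiting at float precision
    y = resample_poly(y_hi, 1, 4)               # back down to 48 kHz
    tpdf = (rng.random(y.shape) - rng.random(y.shape)) / 32768.0  # TPDF dither, ~1 LSB
    return np.clip(np.round((y + tpdf) * 32768.0), -32768, 32767).astype(np.int16)
```

The point being: whatever headroom the effect needs, it gets from the float intermediate, not from the delivery format.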
As for DJs, they do use high-res assets when producing mixes. That's still the mastering stage, technically.
Dithering (or more bits) does solve this. A fade-out of the song also lowers the captured noise floor, but the dither keeps going.
It's akin to noticing occasional posterization (banding) in very dark scenes if your TV isn't totally crushing the blacks. With a higher-than-recommended black level you will see this artifact, because perceptual video codecs destroy (for efficiency's sake) the visual dither that would otherwise soften the bands of dark color into a nice grainy halftone sort of thing, which would be much less offensive.
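Back in the audio domain, here's a toy numpy sketch of the fade-out point (the -60 to -120 dBFS fade range, the 1 kHz tone, and the TPDF dither level are assumptions): without dither the tail collapses into exact digital silence, while with dither the tone keeps riding inside the noise.

```python
import numpy as np

fs = 48_000
t = np.arange(fs) / fs
fade = 10.0 ** (np.linspace(-60, -120, fs) / 20.0)   # fade from -60 dBFS down to -120 dBFS
x = fade * np.sin(2 * np.pi * 1000 * t)

lsb = 1.0 / 32768.0                                  # one 16-bit LSB
rng = np.random.default_rng(0)
tpdf = (rng.random(fs) - rng.random(fs)) * lsb       # TPDF dither, about +/- 1 LSB

q_plain  = np.round(x / lsb) * lsb                   # undithered: tail collapses to exact zeros
q_dither = np.round((x + tpdf) / lsb) * lsb          # dithered: the tone persists inside the noise

print("exact-zero samples, no dither:   %.0f%%" % (100 * np.mean(q_plain == 0.0)))
print("exact-zero samples, with dither: %.0f%%" % (100 * np.mean(q_dither == 0.0)))
```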
That's why producers (mixing many tracks in a session) want to use high bit depth stems: they are summing the noise from n tracks, and uncorrelated noise power adds, so the combined floor rises by roughly 10*log10(n) dB.
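A quick numpy back-of-the-envelope (the 32-stem count and the uniform-noise model of each stem's floor are assumptions) shows that rise:

```python
import numpy as np

rng = np.random.default_rng(0)
lsb16 = 1.0 / 32768.0
n_tracks, n_samples = 32, 200_000

# Model each stem's quantization floor as independent uniform noise of +/- half an LSB.
noise = rng.uniform(-lsb16 / 2, lsb16 / 2, size=(n_tracks, n_samples))

one_stem_rms = np.sqrt(np.mean(noise[0] ** 2))
summed_rms   = np.sqrt(np.mean(noise.sum(axis=0) ** 2))

print("single stem floor: %.1f dBFS" % (20 * np.log10(one_stem_rms)))   # about -101 dBFS
print("32-stem sum floor: %.1f dBFS" % (20 * np.log10(summed_rms)))     # about -86 dBFS
print("predicted rise: 10*log10(32) = %.1f dB" % (10 * np.log10(n_tracks)))
```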
It’s a pointless exercise for DJs or anyone listening to a single source to use a higher bit depth.
I think that "if" is doing a heavy work here.
However, any kind of subsequent processing in the digital domain completely destroys the benefit of dithering if it's applied at 16 bits (i.e., without first converting up to 24 bits), even something as simple as a volume change by the listener. For that reason, we might say that additional processing isn't confined to the recording studio and can happen at the end-user level.
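To make that concrete, here's a small numpy sketch (the -70 dBFS tone, the 997 Hz frequency, and the -20 dB volume step are assumptions, and the TPDF dither stands in for whatever the mastering engineer actually used): re-quantizing the gain-reduced signal straight back to 16 bits without fresh dither leaves tonal quantization spurs in the error, while doing the gain in floats and re-dithering keeps the error noise-like.

```python
import numpy as np

rng = np.random.default_rng(0)
lsb = 1.0 / 32768.0                                     # one 16-bit LSB

def q16(x):                                             # snap to the 16-bit grid, no dither
    return np.round(x / lsb) * lsb

def q16_dithered(x):                                    # snap to the 16-bit grid with TPDF dither
    tpdf = (rng.random(x.shape) - rng.random(x.shape)) * lsb
    return np.round((x + tpdf) / lsb) * lsb

fs = 48_000
t = np.arange(fs) / fs
tone = 10 ** (-70 / 20) * np.sin(2 * np.pi * 997 * t)   # quiet tail at -70 dBFS
master = q16_dithered(tone)                             # the properly dithered 16-bit release

gain = 10 ** (-20 / 20)                                 # listener turns it down 20 dB digitally
stay_16bit = q16(master * gain)                         # gain applied and stored at 16 bits, no re-dither
float_path = q16_dithered(master * gain)                # gain at float precision, re-dithered on output

def peak_spur_db(err):
    """Peak of the error spectrum over its median; large values mean tonal distortion spurs."""
    spec = np.abs(np.fft.rfft(err))
    return 20 * np.log10(spec.max() / np.median(spec))

print("error spurs, no re-dither: %.1f dB" % peak_spur_db(stay_16bit - master * gain))
print("error spurs, re-dithered:  %.1f dB" % peak_spur_db(float_path - master * gain))
```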
I'm unsure whether this same logic applies to sampling frequency, but probably? I guess post-mastering processing of amplitude is far more common than time-based changes, but maybe DJs doing beat matching?