Here's some early work in this area which seems promising: https://guanjunwu.github.io/4dgs/
It seems to me they don't use any learned ML model at all. Instead, they use backpropagation to jointly optimise the entire physics/motion model, which covers both the camera motion and the resulting blurry images: for each camera frame they render multiple images along the camera's path of motion, then average them to simulate motion blur.
The only data they optimise over, as far as I understand, is the set of images captured along the current camera trajectory.
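To make that concrete, here's a minimal runnable sketch of the optimisation structure in PyTorch. It's a toy: the "scene" is a single Gaussian blob and the "camera motion" is a 2D translation, standing in for a real differentiable Gaussian Splatting renderer and 6-DoF poses. None of this is the paper's actual code, just my reading of the idea.

```python
import torch

# Toy differentiable "renderer": a Gaussian blob on a 64x64 image,
# shifted by the camera offset at one instant of the exposure.
H = W = 64
ys, xs = torch.meshgrid(torch.arange(H).float(), torch.arange(W).float(), indexing="ij")

def render(blob_pos, cam_offset):
    cy, cx = blob_pos - cam_offset
    return torch.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * 3.0 ** 2))

def simulate_blur(blob_pos, cam_start, cam_end, n=8):
    # Render several sharp images along the camera path during the
    # exposure, then average them to approximate motion blur.
    return torch.stack([
        render(blob_pos, cam_start + (i / (n - 1)) * (cam_end - cam_start))
        for i in range(n)
    ]).mean(0)

# Synthesize a "captured" blurry photo from a ground-truth motion.
true_blob = torch.tensor([40.0, 20.0])
blurry_target = simulate_blur(true_blob, torch.zeros(2), torch.tensor([0.0, 10.0])).detach()

# Jointly optimise the scene (blob position) and the camera motion by
# backprop through the blur simulation, using only the blurry image.
# (With a single image the scene/camera split is ambiguous; in practice
# multiple frames disambiguate it.)
blob = torch.nn.Parameter(torch.tensor([32.0, 32.0]))
cam_start = torch.nn.Parameter(torch.zeros(2))
cam_end = torch.nn.Parameter(torch.zeros(2))
opt = torch.optim.Adam([blob, cam_start, cam_end], lr=0.5)
for step in range(500):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(
        simulate_blur(blob, cam_start, cam_end), blurry_target
    )
    loss.backward()
    opt.step()
```

The key property is that every step (render, interpolate, average) is differentiable, so gradients from the image loss flow back into both the scene parameters and the motion parameters at once.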
The OP paper is cool but isn't alone; here's some concurrent work: https://github.com/SpectacularAI/3dgs-deblur
Also related, from a couple of years ago: using NeRF methods (another area of current 3D research) to denoise night images and recover HDR: https://bmild.github.io/rawnerf/

NeRF, like Gaussian Splatting, seeks to reconstruct the scene in 3D, and RawNeRF adapts the approach to handle noisy raw images as well as large exposure variation.
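The core change in RawNeRF, as I understand the paper, is to train directly on the noisy linear raw pixels with a weighted L2 loss that approximates an error on tone-mapped values, so dark regions aren't drowned out by bright ones. A sketch of that loss (the function name is mine):

```python
import torch

def rawnerf_weighted_loss(rendered, raw_target, eps=1e-3):
    # Weight each pixel's squared error by 1 / (rendered + eps), with a
    # stop-gradient (detach) on the weight, while supervising in linear
    # raw space. Averaging many noisy views then denoises essentially
    # for free, since raw sensor noise is zero-mean.
    weight = 1.0 / (rendered.detach() + eps)
    return (weight * (rendered - raw_target)).square().mean()
```

Because the reconstruction lives in linear HDR radiance, exposure and tone mapping can be changed after the fact, which is where the HDR recovery comes from.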
In terms of Gaussian Splats vs GenAI: GenAI models have usually been trained on millions of images, which gives them a prior they can use to infer missing parts of the 3D scene or of the input images. Gaussian Splats (and NeRF), however, lack those learned priors and can only reconstruct what the input images actually observed.