Based on my limited experience, the performance of running Firefox remotely and displaying it on a local X11 server was very poor, and I assumed the absence of these types of acceleration was to blame.
I could imagine XRender working, though. That would at least support blitting most of the pixels up or down when scrolling, and would only require pushing new pixels over the network for the newly exposed areas.
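A rough sketch of what I mean, using plain xcb (this is just the core-protocol CopyArea rather than XRender proper; window/GC creation and repainting the exposed strip are omitted, and the names are made up):

    #include <xcb/xcb.h>

    /* Scroll the window contents up by scroll_dy pixels entirely on the
     * X server; afterwards only the newly exposed strip at the bottom
     * needs fresh pixels from the client. */
    void scroll_blit(xcb_connection_t *conn, xcb_window_t win,
                     xcb_gcontext_t gc, uint16_t width, uint16_t height,
                     int16_t scroll_dy)
    {
        /* One tiny request crosses the network; the pixels stay server-side. */
        xcb_copy_area(conn, win, win, gc,
                      0, scroll_dy,                 /* source x, y */
                      0, 0,                         /* destination x, y */
                      width, height - scroll_dy);

        /* The strip from (height - scroll_dy) to height is newly exposed
         * and would be pushed with xcb_put_image or an XRender composite. */
        xcb_flush(conn);
    }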
My guess is that the lack of shared memory buffers rules out OpenGL, and whilst it's theoretically possible, it's probably unimplemented because nobody does that in 2025.
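For what it's worth, the shared-memory part is easy to observe: MIT-SHM only helps when client and server are on the same machine, so a remote client falls back to pushing pixels over the socket. A quick check (just a sketch, assuming libxcb-shm is installed):

    #include <stdio.h>
    #include <xcb/xcb.h>
    #include <xcb/shm.h>

    int main(void)
    {
        xcb_connection_t *conn = xcb_connect(NULL, NULL);
        if (xcb_connection_has_error(conn))
            return 1;

        const xcb_query_extension_reply_t *ext =
            xcb_get_extension_data(conn, &xcb_shm_id);
        printf("MIT-SHM %s\n", (ext && ext->present) ? "present" : "not available");

        /* Even when the server advertises it, attaching a segment only
         * works locally, so remote clients end up on the PutImage path. */
        xcb_disconnect(conn);
        return 0;
    }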
IMHO, widespread use of Xlib led people to believe that X was much more synchronous than it really is, discouraged people from using its networking, and as a result a lot of stuff that could have been possible never got built. xcb is a much better way to interact with X, but it may have arrived too late.
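The asynchrony is easy to see in xcb's cookie/reply split: you can fire off a batch of requests and pay the round-trip latency once, instead of blocking per call the way Xlib's API nudges you to. A toy example (interning a few atoms, nothing browser-specific):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <xcb/xcb.h>

    int main(void)
    {
        xcb_connection_t *conn = xcb_connect(NULL, NULL);
        if (xcb_connection_has_error(conn))
            return 1;

        const char *names[] = { "WM_PROTOCOLS", "WM_DELETE_WINDOW", "_NET_WM_NAME" };
        xcb_intern_atom_cookie_t cookies[3];

        /* Send all requests without waiting: one batch goes over the wire. */
        for (int i = 0; i < 3; i++)
            cookies[i] = xcb_intern_atom(conn, 0, strlen(names[i]), names[i]);

        /* Collect replies afterwards; the latency is paid once, not per call
         * (Xlib's XInternAtom blocks on every first lookup). */
        for (int i = 0; i < 3; i++) {
            xcb_intern_atom_reply_t *r =
                xcb_intern_atom_reply(conn, cookies[i], NULL);
            if (r) {
                printf("%s -> atom %u\n", names[i], r->atom);
                free(r);
            }
        }

        xcb_disconnect(conn);
        return 0;
    }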
Here is an awesome (slightly outdated) talk about the architecture: https://groups.google.com/a/chromium.org/g/blink-dev/c/AK_rw...
The basic idea is that HTML content is drawn in transparent 'tiles' which are layered on top of one another. When the user scrolls, the tiles don't need to be redrawn, but are instead just re-composited at their new positions. GPUs are super fast at that, and even a 15-year-old GPU can easily composite tens of layers at 60 FPS.
On Linux with a remote X server, I think the tiles would all end up on the X server, with only the pretty small 'draw tile number 22 at this location' command going across the network. So the answer to your question is 'yes'.
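To make that concrete, the remote-X version would look roughly like this: each tile lives as a server-side Pixmap (its pixels cross the wire once, when it's first rendered), and scrolling just re-sends tiny CopyArea requests at new offsets. Purely a sketch; the names, the 256-pixel tile size, and all the setup and damage handling are made up, and real alpha-blended layers would go through XRender's Composite rather than CopyArea, but the network-traffic story is the same:

    #include <xcb/xcb.h>

    #define TILE 256   /* assumed tile height, in pixels */

    typedef struct {
        xcb_pixmap_t pixmap;   /* lives in the X server's memory */
        int16_t doc_y;         /* tile position in document coordinates */
    } tile_t;

    /* Recomposite already-uploaded tiles at a new scroll offset; no pixel
     * data is retransmitted, only small "draw tile here" requests. */
    void composite_tiles(xcb_connection_t *conn, xcb_window_t win,
                         xcb_gcontext_t gc, const tile_t *tiles, int n_tiles,
                         uint16_t win_width, uint16_t win_height,
                         int32_t scroll_y)
    {
        for (int i = 0; i < n_tiles; i++) {
            int32_t y = tiles[i].doc_y - scroll_y;      /* window position */
            if (y + TILE <= 0 || y >= (int32_t)win_height)
                continue;                               /* tile is off-screen */

            xcb_copy_area(conn, tiles[i].pixmap, win, gc,
                          0, 0, 0, (int16_t)y, win_width, TILE);
        }
        xcb_flush(conn);
    }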