The server can decide that the client is offline if it misses expected heartbeat messages from the client. But how often will those be sent, and how long a grace period will we allow? If it's too short, it will be unreliable on shaky 4G connections; if it's too long, it will be annoying in the other direction, because a lock held by a client that has already gone away takes too long to be released.
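To make the trade-off concrete, here is a minimal Python sketch of that server-side check. The class name, the 5-second interval and the three-missed-beats grace period are all made up for illustration, not recommendations.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per client (hypothetical helper)."""

    def __init__(self, interval_s=5.0, missed_allowed=3, clock=time.monotonic):
        # Grace period expressed as a number of heartbeat intervals.
        self.grace_s = interval_s * missed_allowed
        self.clock = clock  # injectable so the timeout can be tested
        self.last_seen = {}  # client_id -> time of last heartbeat

    def beat(self, client_id):
        """Record a heartbeat from the client."""
        self.last_seen[client_id] = self.clock()

    def is_online(self, client_id):
        """A client is online if it has sent a heartbeat within the grace period."""
        seen = self.last_seen.get(client_id)
        return seen is not None and self.clock() - seen <= self.grace_s
```

Shrinking `missed_allowed` makes the server notice dead clients sooner but also kicks out people on flaky connections; growing it does the opposite. The code doesn't solve that, it just shows where the knob is.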
And that's not considering the "social" problems with locks. I've worked on replacing a lock-based system with CRDTs, and in the old system the lunch scenario from MontagFTB actually was a common occurrence.
In an "ideal" scenario your lock acquisition problem is not hard. Clients just show the UI optimistically, and whoever the server decides was first gets the lock. The losing client throws away any state the user created in that short time frame. Over reliable and fast connections, and with granular locks, this works fine. But in the real world that's just one of the issues with a lock-based approach…
It turns out you just have to pick one... This all depends on having a source of truth, and once you have one it's easy to pick, say based on whichever request arrived at the network interface first.
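A minimal sketch of "whichever arrived first", assuming a single server process that handles lock requests one at a time, so arrival order is just call order. `LockArbiter` and its method names are hypothetical.

```python
class LockArbiter:
    """Single source of truth for who holds which lock (hypothetical helper)."""

    def __init__(self):
        self.holders = {}  # element_id -> client_id currently holding the lock

    def acquire(self, element_id, client_id):
        """Grant the lock to the first caller; later callers are refused."""
        # setdefault only stores client_id if no holder is recorded yet.
        holder = self.holders.setdefault(element_id, client_id)
        return holder == client_id

    def release(self, element_id, client_id):
        """Release the lock, but only if the caller actually holds it."""
        if self.holders.get(element_id) == client_id:
            del self.holders[element_id]
            return True
        return False
```

A client that gets `False` back from `acquire` is the losing one and has to discard whatever it showed optimistically.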
Holding the lock for as long as the user has browser focus on the element seems like a reasonable answer, with changes submitted as they are made. However, I don't know how you resolve the conflict when two users simultaneously attempt to acquire a lock.