Interesting. So ioeventfd is also handled in userspace, I guess.
A couple of years ago I measured a huge slowdown on userspace vmexits for guests spanning multiple NUMA nodes, caused by cacheline bouncing on tsk->sighand->siglock. Maybe you're not using KVM_SET_SIGNAL_MASK.
(Steve, I suppose?)
ioeventfd for PIO exits is still handled in the kernel, but that one is easy since it's a dedicated VMEXIT type.
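For anyone following along, here is a rough sketch of what registering a PIO ioeventfd looks like through the stock KVM API. It is not our actual code: the helper names `make_pio_ioeventfd` and `register_pio_ioeventfd` are made up for illustration, and the width check is my own assumption about typical port accesses rather than a KVM requirement.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: describe a doorbell on an I/O port.
 * A guest write of `len` bytes to `port` signals `efd` in the kernel,
 * so no exit to userspace is needed for that write. */
struct kvm_ioeventfd make_pio_ioeventfd(uint16_t port, uint32_t len, int efd)
{
    struct kvm_ioeventfd io;

    /* Assumption for this sketch: only 1-, 2- or 4-byte port accesses. */
    assert(len == 1 || len == 2 || len == 4);

    memset(&io, 0, sizeof(io));        /* also clears datamatch and pad */
    io.addr  = port;                   /* port number, since FLAG_PIO is set */
    io.len   = len;
    io.fd    = efd;
    io.flags = KVM_IOEVENTFD_FLAG_PIO; /* without it, addr is an MMIO address */
    return io;
}

/* Hypothetical helper: register the doorbell on a VM file descriptor
 * (the fd returned by KVM_CREATE_VM). Returns 0 on success, -1 with
 * errno set on failure, as ioctl() does. */
int register_pio_ioeventfd(int vm_fd, uint16_t port, uint32_t len, int efd)
{
    struct kvm_ioeventfd io = make_pio_ioeventfd(port, len, efd);
    return ioctl(vm_fd, KVM_IOEVENTFD, &io);
}
```

The `efd` argument would normally come from `eventfd(0, EFD_NONBLOCK)`, with whichever thread services the device waiting on it. Dropping `KVM_IOEVENTFD_FLAG_PIO` makes the same struct describe an MMIO address instead.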
We do very little that typically requires trapping MMIO, particularly in performance-sensitive places: VIRTIO Net and VIRTIO SCSI do not, and honestly there's not much that guests do inside GCE that isn't either disk or networking :)
(Yet-another-Googler: I worked on this and spoke about it at KVM Forum)