titzer parent
wasm3 uses tail calls to implement its interpreter bytecode handlers, and it manages to force tail-call optimization in both GCC and LLVM. Worth having a look at how it does that?
It's possible, but GCC requires optimizations to be enabled before it eliminates the tail calls. MSVC is completely off-limits with that approach.
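
For anyone who hasn't seen the pattern, here is a minimal sketch of tail-call dispatch in C. It is not wasm3's actual code: the `Instr` layout, the handler names, the `NEXT` and `MUSTTAIL` macros, and `run` are all made up for illustration. Every handler has the same signature and ends by calling the next handler in return position. Where the compiler advertises a `musttail` attribute the sketch uses it; otherwise it falls back to a plain `return` and relies on the optimizer to turn the call into a jump, which is where the "optimizations must be enabled" caveat comes from.

```c
#include <stdint.h>

/* Use the musttail statement attribute when the compiler reports it.
   Otherwise MUSTTAIL expands to nothing and we depend on the optimizer
   performing sibling-call elimination (e.g. at -O2). */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

/* Hypothetical instruction encoding: a handler pointer plus one immediate. */
typedef struct Instr Instr;
typedef int64_t (*Handler)(const Instr *pc, int64_t acc);
struct Instr {
    Handler op;
    int64_t imm;
};

/* Dispatch to the following instruction as a tail call. All handlers share
   one signature, which is what makes the call eligible for elimination. */
#define NEXT(pc, acc) MUSTTAIL return (pc)[1].op((pc) + 1, (acc))

static int64_t op_add(const Instr *pc, int64_t acc)
{
    acc += pc->imm;
    NEXT(pc, acc);
}

static int64_t op_mul(const Instr *pc, int64_t acc)
{
    acc *= pc->imm;
    NEXT(pc, acc);
}

/* Ends the chain by returning the accumulator instead of dispatching. */
static int64_t op_halt(const Instr *pc, int64_t acc)
{
    (void)pc;
    return acc;
}

/* Entry point: jump into the first handler with an initial accumulator. */
int64_t run(const Instr *program, int64_t initial)
{
    return program[0].op(program, initial);
}
```

Running `run` on the three-instruction program `{op_add, 2}, {op_mul, 21}, {op_halt, 0}` with an initial accumulator of 0 computes (0 + 2) * 21. Without the attribute and without optimization the result is the same, but each dispatch consumes a stack frame, so a long-running program would eventually overflow the stack.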