Pointers need to be canonical if LAM/UAI is not enabled. The simplest way to do it is to shift left by 16, then shift arithmetic right by 16. (Or 7 if using 5-level paging). Alternatively, you can store the pointer shifted left by 16 bits, and have the tag in the lower 16 bits, then canonicalizing the pointer is just a single shift-arithmetic-right. If combining with NaN-boxing, then you rotate right to recover the double. (Demo: https://godbolt.org/z/MvvPcq9Ej). This is actually more efficient than messing with the high bits directly.
With LAM/UAI, the requirement is that the 63rd bit matches the 47th (or 56th) bit, which gives 15-bits of tag space on LAM48 and 6-bits of tag space on LAM57.
With LAM enabled, care needs to be taken when doing any pointer comparison, as two pointers which point to the same address may not be equal. There have been multiple exploits with LAM, including speculative execution exploits.
Using 16 bits may be risky on recent x86. For example, IIRC Linux enables 5-level page tables on microarchitectures that support it, which can put valid address data in bits 48-56.
There is no guarantee that those 6 bits are safe either. They are just the only bits for which I could not find existing or roadmap usage across x86 and ARM sources when I last did a search.
Linux will not allocate past the 47-bit range, even with 5-level paging enabled, unless specifically requested, by providing a pointer hint to `mmap` with a higher address.
https://www.kernel.org/doc/html/v5.14/x86/x86_64/5level-pagi...
If you assume the top 16 bits of a pointer are unused[1], you can fit a pointer in there. This lets you store a pointer or a full double by-value (and still have tag bits left for other types!).
Last I checked LuaJIT and WebKit both still used this to represent their values.
[1] On amd64 they actually need to be sort of "sign extended" so require some fixup once extracted.