Dynamic types have classically used the lower bits freed by alignment constraints. If I know a cons cell is 16 bytes then I can use the low 4 bits of an address to store enough type info to disambiguate.
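To make the low-bits trick concrete, here is a minimal sketch in C. It assumes the runtime hands out 16-byte-aligned cells, so the low 4 bits of every cell address are zero; the names `tag_ptr`, `ptr_tag` and `untag_ptr` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_BITS 4u
#define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1u)) /* 0xF */

/* Pack a small type tag into the low bits of a 16-byte-aligned pointer. */
static uintptr_t tag_ptr(void *p, unsigned tag) {
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_MASK) == 0); /* assumption: 16-byte alignment */
    assert(tag <= TAG_MASK);        /* only 4 bits are available */
    return bits | tag;
}

/* Recover the tag. */
static unsigned ptr_tag(uintptr_t v) {
    return (unsigned)(v & TAG_MASK);
}

/* Recover the original pointer by clearing the tag bits. */
static void *untag_ptr(uintptr_t v) {
    return (void *)(v & ~TAG_MASK);
}
```

The tag has to be masked off before every dereference, which is the usual cost of this scheme.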

There's a technique known as "NaN boxing", which exploits the fact that double-precision floats leave almost 52 bits free for extra data in what would otherwise be NaNs.

If you assume the top 16 bits of a pointer are unused[1], you can fit a pointer in there. This lets you store a pointer or a full double by-value (and still have tag bits left for other types!).
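A rough sketch of the idea in C, not how LuaJIT or WebKit actually lay out their values. It assumes pointers fit in the low 48 bits with the top 16 bits zero, and it picks `0xFFFC` as a hypothetical pointer prefix (a negative quiet NaN pattern that ordinary arithmetic should not produce). All names here are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t nanbox;

#define HIGH16_MASK UINT64_C(0xFFFF000000000000)
#define PTR_TAG     UINT64_C(0xFFFC000000000000) /* hypothetical choice */

/* Doubles are stored by value: just reinterpret the bits. */
static nanbox box_double(double d) {
    nanbox v;
    memcpy(&v, &d, sizeof v);
    return v;
}

static double unbox_double(nanbox v) {
    double d;
    memcpy(&d, &v, sizeof d);
    return d;
}

/* A value is a pointer if its top 16 bits match the chosen NaN prefix. */
static bool is_ptr(nanbox v) {
    return (v & HIGH16_MASK) == PTR_TAG;
}

static nanbox box_ptr(void *p) {
    uint64_t bits = (uint64_t)(uintptr_t)p;
    assert((bits & HIGH16_MASK) == 0); /* assumption: top 16 bits unused */
    return PTR_TAG | bits;
}

static void *unbox_ptr(nanbox v) {
    assert(is_ptr(v));
    return (void *)(uintptr_t)(v & ~HIGH16_MASK);
}
```

Other prefixes in the NaN space could carry further types (integers, booleans and so on); this sketch only distinguishes "pointer" from "double".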

Last I checked LuaJIT and WebKit both still used this to represent their values.

[1] On amd64 they actually need to be sort of "sign extended" so require some fixup once extracted.

> On amd64 they actually need to be sort of "sign extended" so require some fixup once extracted.

Pointers need to be canonical if LAM/UAI is not enabled. The simplest way to do it is to shift left by 16, then shift arithmetic right by 16. (Or 7 if using 5-level paging). Alternatively, you can store the pointer shifted left by 16 bits, and have the tag in the lower 16 bits, then canonicalizing the pointer is just a single shift-arithmetic-right. If combining with NaN-boxing, then you rotate right to recover the double. (Demo: https://godbolt.org/z/MvvPcq9Ej). This is actually more efficient than messing with the high bits directly.
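For anyone who does not want to open the demo, the two layouts described above look roughly like this in C. This is my own sketch, not the code behind the godbolt link; the function names are made up, and it assumes 48-bit addresses and a compiler where `>>` on a negative signed value is an arithmetic shift (implementation-defined in C, but true of GCC, Clang and MSVC).

```c
#include <assert.h>
#include <stdint.h>

/* Layout A: tag in the high 16 bits, pointer in the low 48.
   Shift left by 16 to drop the tag, then arithmetic shift right by 16
   so bit 47 is copied into bits 48..63. */
static uint64_t canonicalize48(uint64_t v) {
    return (uint64_t)((int64_t)(v << 16) >> 16);
}

/* Layout B: pointer stored shifted left by 16, tag in the low 16 bits. */
static uint64_t pack_shifted(uint64_t ptr, uint16_t tag) {
    return (ptr << 16) | tag;
}

/* With layout B a single arithmetic shift both drops the tag and
   restores the canonical pointer. */
static uint64_t shifted_ptr(uint64_t v) {
    return (uint64_t)((int64_t)v >> 16);
}

static uint16_t shifted_tag(uint64_t v) {
    return (uint16_t)(v & 0xFFFFu);
}
```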

With LAM/UAI, the requirement is that bit 63 matches bit 47 (or bit 56 with 5-level paging), which gives 15 bits of tag space on LAM48 and 6 bits of tag space on LAM57.

With LAM enabled, care needs to be taken when doing any pointer comparison, as two pointers which point to the same address may not be equal. There have been multiple exploits with LAM, including speculative execution exploits.
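The comparison pitfall can be shown with plain integer arithmetic. The sketch below only models the LAM48 bit layout (tag in bits 48..62); it does not enable LAM, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LAM48_TAG_SHIFT 48
#define LAM48_TAG_MAX   UINT64_C(0x7FFF)                      /* 15 bits */
#define LAM48_TAG_MASK  (LAM48_TAG_MAX << LAM48_TAG_SHIFT)    /* bits 48..62 */

/* Place a 15-bit tag in bits 48..62, leaving bit 63 and bits 0..47 alone. */
static uint64_t lam48_tag(uint64_t ptr, uint64_t tag) {
    assert(tag <= LAM48_TAG_MAX);
    return (ptr & ~LAM48_TAG_MASK) | (tag << LAM48_TAG_SHIFT);
}

/* Strip the tag by sign-extending from bit 47
   (assumes arithmetic right shift on signed values). */
static uint64_t lam48_strip(uint64_t v) {
    return (uint64_t)((int64_t)(v << 16) >> 16);
}

/* Compare the addresses, not the raw tagged bit patterns. */
static bool same_address(uint64_t a, uint64_t b) {
    return lam48_strip(a) == lam48_strip(b);
}
```

Two values built from the same address with different tags compare unequal as raw integers but equal through `same_address`.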

Apologies, there's a mistake in the godbolt link above. `SIGN_BIT` should be `0x8000` and not `0x1000`.

If you restrict yourself to all variants of x86 and ARM, the number of high bits for which I could not find conflicting uses is 6 bits (bits 57-62). The other high bits are reserved in some hardware contexts and therefore may create conflicts.

Using 16 bits may be risky on recent x86. For example, IIRC Linux enables 5-level page tables on microarchitectures that support it, which can put valid address data in bits 48-56.

There is no guarantee that those 6 bits are safe either. They are just the only bits for which I could not find existing or roadmap usage across x86 and ARM sources when I last did a search.

> Using 16 bits may be risky on recent x86. For example, IIRC Linux enables 5-level page tables on microarchitectures that support it, which can put valid address data in bits 48-56.

Linux will not allocate past the 47-bit range, even with 5-level paging enabled, unless specifically requested, by providing a pointer hint to `mmap` with a higher address.
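As a sketch of what that opt-in looks like from userspace, assuming Linux and a 64-bit process: the first helper maps memory with whatever hint it is given, the second passes a hint above the 47-bit boundary. The helper names and the choice of `1 << 48` as the hint are mine. On a machine without 5-level paging the kernel simply treats the high hint as unusable and picks an ordinary address.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Map `len` bytes of anonymous read/write memory; `hint` may be NULL,
   in which case Linux stays within the 47-bit range by default. */
static void *map_anon(void *hint, size_t len) {
    return mmap(hint, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Opt in to the larger address space by hinting above 47 bits.
   Without MAP_FIXED this is only a hint, so the call still succeeds
   on hardware or kernels that cannot honour it. */
static void *map_anon_high(size_t len) {
    void *hint = (void *)(uintptr_t)(UINT64_C(1) << 48);
    return map_anon(hint, len);
}
```

A runtime that never passes such a hint should therefore keep seeing pointers with the top 17 bits clear on x86-64 Linux, per the kernel documentation linked below.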

https://www.kernel.org/doc/html/v5.14/x86/x86_64/5level-pagi...

Ah, thanks for the detail! I was unaware that this was how it worked.

There are numerous techniques in use. Many are covered in Gudeman's 1993 paper "Representing Type Information in Dynamically Typed Languages"[1], which includes low-bits tagging, high-bits tagging, and NaN-boxing.

The high bits let us tag more types, and can be used in conjunction with low-bits tagging. E.g., we might use the low bits for GC marking.
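A small sketch of combining the two, under my own assumptions: a 16-bit type in the high bits, a 48-bit address that is 16-byte aligned, and bit 0 used as the GC mark. None of these names or constants come from the paper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TYPE_SHIFT 48
#define MARK_BIT   UINT64_C(1)
#define ADDR_MASK  UINT64_C(0x0000FFFFFFFFFFF0) /* 48-bit, 16-byte aligned */

/* Build a reference from an aligned address and a 16-bit type tag. */
static uint64_t make_ref(uint64_t addr, uint16_t type) {
    assert((addr & ~ADDR_MASK) == 0);
    return ((uint64_t)type << TYPE_SHIFT) | addr;
}

static uint16_t ref_type(uint64_t v) { return (uint16_t)(v >> TYPE_SHIFT); }
static uint64_t ref_addr(uint64_t v) { return v & ADDR_MASK; }

/* The mark bit lives in the low bits, so the collector can flip it
   without disturbing the type tag or the address. */
static bool     is_marked(uint64_t v)  { return (v & MARK_BIT) != 0; }
static uint64_t set_mark(uint64_t v)   { return v | MARK_BIT; }
static uint64_t clear_mark(uint64_t v) { return v & ~MARK_BIT; }
```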

[1]:https://web.archive.org/web/20170705085007/ftp://ftp.cs.indi...

Depends on the architecture. For instance, using the top bit lets you do what the hardware thinks of as an 'is negative' check, which is very cheap on a lot of archs.
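In C terms the check is just a signed comparison against zero, which compilers typically turn into a sign test and branch. A tiny sketch, assuming bit 63 is the (hypothetical) "boxed" flag and a two's-complement target:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BOXED_BIT (UINT64_C(1) << 63)

/* Set the top bit to flag the value as boxed. */
static uint64_t mark_boxed(uint64_t payload) {
    assert((payload & BOXED_BIT) == 0); /* top bit must be free */
    return payload | BOXED_BIT;
}

/* Top bit set means the value reads as negative when viewed as signed. */
static bool is_boxed(uint64_t v) {
    return (int64_t)v < 0;
}
```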
Is it a guarantee that a 16-byte object would be 16-byte aligned?

Not in general, but it is a guarantee that a runtime where all allocations are 16-byte cons cells can choose to make quite trivially.
For memory allocation, POSIX (posix_memalign) has been guaranteeing alignment since 2001. C11 added equivalent functionality (aligned_alloc). C++17 incorporated it (std::aligned_alloc) as well.
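A minimal C11 sketch of getting 16-byte-aligned storage from the standard allocator. The wrapper name is made up; it rounds the size up because C11 originally required the size passed to `aligned_alloc` to be a multiple of the alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate at least `size` bytes aligned to 16.
   Returns NULL on failure; release with free(). */
static void *alloc_aligned16(size_t size) {
    size_t rounded = (size + 15u) & ~(size_t)15u; /* next multiple of 16 */
    return aligned_alloc(16, rounded);
}
```

`posix_memalign(&p, 16, size)` does the same job on POSIX systems, with the pointer returned through an out-parameter instead.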
More importantly, C++17 no longer ignores alignment in dynamic memory allocation: https://en.cppreference.com/w/cpp/memory/new/operator_new

C++11 already had alignas, but it was not really integrated well.

If you implement malloc you can do that. The OS generally gives you 4k (or some other number in that range) at a time and malloc subdivides it.

Language runtimes can call malloc whatever they want.

In C++ you can force that with alignas(), I would imagine other low level languages offer something similar.

If you're using a custom allocator you'd have to enforce it yourself, which should be fine since you have full control.

https://en.cppreference.com/w/cpp/language/alignas.html

C23 also has `alignas` and `alignof` (`_Alignas`/`_Alignof` in C11, with the lowercase spellings as macros in stdalign.h), and also provides `aligned_alloc` and `free_aligned_sized` in stdlib.
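For the type-level side, something like the following should compile as either C11 or C23. The `struct cons` layout is only an illustration; including `<stdalign.h>` supplies the lowercase spellings on C11 and is harmless on C23.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Aligning the first member raises the alignment of the whole struct,
   so every `struct cons` object starts on a 16-byte boundary. */
struct cons {
    alignas(16) void *car;
    void *cdr;
};

/* Checked at compile time. */
static_assert(alignof(struct cons) == 16, "cons cells must be 16-byte aligned");
```

This covers objects with automatic or static storage; heap cells still need an allocator that honours the alignment.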
Dynamic languages usually come with their own memory manager. They can come up with their own alignment constraints. That being said, most contemporary (Linux) architectures require that malloc returns 16-byte-aligned pointers. Some mallocs only promise this for allocations larger than 8 bytes, though (and I think the C standard was updated to permit that).

No. It depends on the object.
