The formal specification of the rules for unsafe code hasn’t been written yet, because, well, it’s an ambitious goal! Even the C standard is sometimes not really clear about what counts as undefined behavior; Rust wants to do better, while being more permissive and offering a ‘sanitizer’ tool to verify correctness at runtime. And all of this has to be implemented on top of LLVM, which was written by other people, is designed for C’s rules, and, like other compilers, doesn’t even get those right in every case (even when the spec is clear).
For now, the effort is still fairly tentative. But I’m pretty confident that type-based aliasing analysis will never be a thing in Rust, so it will always be legal to read data through ‘wrong-typed’ pointers, both raw pointers and references (as long as it’s valid data, alignment is right, etc.).
Actually, I’m embarrassed: my code from earlier isn’t legal in all cases. It requires the pointer to be correctly aligned, which in the case of String it probably will be, but that’s not guaranteed. Meh.
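To make the alignment point concrete, here's a rough sketch of reading a u32 out of a byte buffer through a 'wrong-typed' pointer. The function names are made up for illustration, and this is not the code from earlier, just the same kind of cast. The first version uses `std::ptr::read_unaligned`, which doesn't care about alignment; the second dereferences the pointer directly, so it checks alignment first and bails out if the check fails.

```rust
use std::mem::{align_of, size_of};
use std::ptr;

/// Hypothetical helper: reads a u32 from the first 4 bytes of `bytes`,
/// making no assumption about how the buffer is aligned.
fn read_u32_prefix(bytes: &[u8]) -> Option<u32> {
    if bytes.len() < size_of::<u32>() {
        return None;
    }
    let p = bytes.as_ptr() as *const u32;
    // SAFETY: at least 4 bytes are readable, every bit pattern is a valid
    // u32, and read_unaligned has no alignment requirement.
    Some(unsafe { ptr::read_unaligned(p) })
}

/// Hypothetical helper: same read, but via a plain dereference, which is
/// only legal when the pointer happens to be aligned for u32.
fn read_u32_prefix_if_aligned(bytes: &[u8]) -> Option<u32> {
    if bytes.len() < size_of::<u32>() {
        return None;
    }
    let p = bytes.as_ptr() as *const u32;
    if (p as usize) % align_of::<u32>() != 0 {
        // Misaligned: dereferencing here would be undefined behavior.
        return None;
    }
    // SAFETY: length and alignment were both checked above.
    Some(unsafe { *p })
}
```

Calling `read_u32_prefix("hello".as_bytes())` always gives back the first four bytes reinterpreted in native byte order, while the second function may return `None` for the very same input depending on where the buffer happens to sit in memory.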
Here's my best understanding of the situation. Someone who actually understands the compiler might have to correct me:
- The compiler doesn't do any strict-aliasing optimizations on pointers in unsafe Rust, which C compilers sometimes do. The Rust memory model isn't fully specified, though, and the status quo seems to come from simply not passing type information to LLVM. It's not clear whether this will change in the future. There's some discussion of it here: http://smallcultfollowing.com/babysteps/blog/2016/05/27/the-...
- References in safe Rust (the vast majority of code) carry much stronger aliasing information than pointers do in C. This is one of the core features of Rust: references that allow mutation are guaranteed not to be aliased. I think the status quo is that this information isn't passed to LLVM because of some LLVM bugs getting in the way, but that it should start working in the near future. When all of this is working, I think it should produce code that's faster than C, in the same way that Fortran sometimes does.
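A small sketch of the difference between the two bullets, with function names made up for illustration. In the first function both arguments are `&mut`, so they can't refer to the same `i32`, and an optimizer that is given that information is allowed to assume the write through `b` leaves `*a` alone. In the second, the raw pointers may point at the same place, so no such assumption is available. Whether a given compiler version actually exploits this is a separate question that this code doesn't answer.

```rust
/// Hypothetical example: `a` and `b` are distinct by construction, so the
/// result is always 3 and the compiler may fold the final loads away.
fn store_and_sum(a: &mut i32, b: &mut i32) -> i32 {
    *a = 1;
    *b = 2;
    *a + *b
}

/// Hypothetical example: same body with raw pointers. If `a` and `b` are
/// the same address, the second store overwrites the first, so `*a` has
/// to be reloaded after the write through `b`.
///
/// # Safety
/// Both pointers must be valid for reads and writes of an i32.
unsafe fn store_and_sum_raw(a: *mut i32, b: *mut i32) -> i32 {
    unsafe {
        *a = 1;
        *b = 2;
        *a + *b
    }
}
```

Passing the same pointer twice to `store_and_sum_raw` is fine and gives 4, whereas the borrow checker rejects `store_and_sum(&mut x, &mut x)` outright. That compile-time guarantee is what would let the optimizer treat the `&mut` version the way a C compiler treats `restrict` pointers.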