  1. `s[0] == 'h'` isn't sufficient to guarantee that `s[3]` can be accessed without a segfault, so the compiler is not allowed to perform this optimization.

    If you use `&` instead of `&&` (so that all array elements are accessed unconditionally), the optimization will happen: https://godbolt.org/z/KjdT16Kfb

    (also note you got the endianness wrong in your hand-optimized version)
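
    As a rough Rust analog of such a hand-optimized check (my sketch, not the code from the linked godbolt): on a little-endian machine the four bytes of "hell" read as a `u32` are `0x6c6c6568`, with 'h' in the low byte.

        fn starts_with_hell(s: &[u8; 4]) -> bool {
            // Little-endian: the FIRST byte 'h' (0x68) ends up in the
            // LOW byte, so the constant is 0x6c6c6568, not 0x68656c6c.
            u32::from_le_bytes(*s) == 0x6c6c_6568
        }

        fn main() {
            assert!(starts_with_hell(b"hell"));
        }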

  2. C# doesn't erase generics; but there's still some type erasure happening: nullable reference types, tuple element names, and the object/dynamic distinction are all absent from .NET bytecode; these are only stored in attributes for public signatures, but are erased for local variable types.

    C# also has huge amounts of syntactic sugar: `yield return` and `await` compile into large state machines; `fixed` statements come with similar problems as `finally` in Java (including the possibility of exponential code growth during decompilation).

  3. Windows NT started supporting Unicode before UTF-8 was invented, back when Unicode was fundamentally 16-bit. As a result, in the Microsoft world, WCHAR meant "supports Unicode" and CHAR meant "doesn't support Unicode yet".

    By the way, UTF-16 also didn't exist yet: Windows started with UCS-2. Though I think the name "UCS-2" also didn't exist yet -- AFAIK that name was only introduced in Unicode 2.0 together with UCS-4/UTF-32 and UTF-16 -- in Unicode 1.0, the 16-bit encoding was just called "Unicode", as there were no other encodings of Unicode.
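
    A quick Rust illustration of the distinction UTF-16 later introduced (my example, not part of the history above): characters outside the BMP need a surrogate pair, which plain UCS-2 could not represent.

        fn main() {
            // U+1D11E (musical G clef) lies outside the 16-bit BMP.
            let units: Vec<u16> = "𝄞".encode_utf16().collect();
            assert_eq!(units, [0xD834, 0xDD1E]); // surrogate pair
        }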

  4. Autovectorization / unrolling can maybe still be handled with a couple of additional tests. The main problem I see with doing branch coverage on compiled machine code is inlining: instead of two tests for one branch, you now need two tests for each function that a copy of the branch was inlined into.
  5. Using a second partition D: already makes small-file writes about twice as fast compared to the system partition C:. This was on Windows 10 with both partitions using NTFS on the same SSD, and everything else on default settings.

    This is because C: has additional file system filters for system integrity that are not present on other drives; and also because C: has 8.3 name compatibility enabled by default, but additional NTFS partitions have that disabled. (so if your filenames are longer than the 8.3 format, Windows has to write twice the number of directory entries)

  6. Only if you measure the life cycle starting from the initial release.

    Windows 10 dropped out of support only 3 years after its successor (Win11) became available; whereas Windows 8.1 still had 7 more years of support after its successor (Win10) was released.

    There are a lot of users who never upgrade Windows but instead just get whatever is the latest whenever they buy a new computer. If these people bought a new computer every 5 years, they were always fine in the past, but now they run out of support for the first time (because Win10 was "the latest" for an unusually long time period).

  7. Python 3 internally uses UTF-32. When exchanging data with the outside world, it uses the "default encoding" which it derives from various system settings. This usually ends up being UTF-8 on non-Windows systems, but on weird enough systems (and almost always on Windows), you can end up with a default encoding other than UTF-8. "UTF-8 mode" (https://peps.python.org/pep-0540/) fixes this but it's not yet enabled by default (this is planned for Python 3.15).
  8. You are not testing what you think you are testing.

    "let &mut a2 = &mut a;" is pattern-matching away the reference, so it's equivalent to "let a2 = a;". You're not actually casting a mutable reference to a pointer, you're casting the integer 13 to a pointer. Dereferencing that obviously produces UB.

    If you fix the program (`let a2 = &mut a;`), then Miri accepts it just fine.
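
    A self-contained version of the fixed program (my reconstruction of the shape being discussed), which Miri runs without complaints:

        fn main() {
            let mut a = 13i32;
            let a2 = &mut a;        // really a `&mut i32` now
            let p = a2 as *mut i32; // casts the reference, not the integer
            unsafe { *p = 42; }     // fine: p points at `a`
            assert_eq!(a, 42);
        }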

  9. But adding 1 to a pointer will add sizeof(T) to the underlying value, so you actually need to reserve more than two addresses if you want to distinguish the "past-the-end" pointer for every object from NULL.
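
    The same arithmetic applies to Rust raw pointers; a quick sketch:

        fn main() {
            let xs = [0u64; 3];
            let p = xs.as_ptr();
            // "p + 1" advances by size_of::<u64>() = 8 bytes, not by 1,
            // so the past-the-end pointer sits a whole element's worth
            // of address space beyond the last object.
            assert_eq!(p.wrapping_add(1) as usize - p as usize, 8);
            assert!(!p.wrapping_add(3).is_null()); // one-past-the-end
        }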

    --

    While it's rare nowadays to find a platform that uses something other than an all-zero bit pattern for NULL for normal pointer types, it's extremely common in C++ for pointer-to-member types: 0 is the offset of the first field at the start of a struct, so NULL is instead represented as -1.

  10. Under your interpretation, neither gcc nor clang is POSIX compliant. Because in practice, all these optimizing compilers will reorder memory accesses without bothering to prove that the pointers involved are valid -- the compiler just assumes that the pointers are valid, which is justified because otherwise the program would have undefined behavior.
  11. That applies only if you take "memory model" to mean modeling the effects of concurrent accesses in multithreaded programs.

    But the term could also be used more generally to include stuff like pointer provenance, Rust's "stacked borrows" etc. In that case, Rust is more complicated than C-as-specified. But C-in-reality is much more complicated, e.g. see https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2263.htm
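
    As a tiny illustration of that broader sense (my example): the following compiles fine, but Miri rejects it under stacked borrows:

        fn main() {
            let mut x = 0i32;
            let p1 = &mut x as *mut i32;
            let p2 = &mut x as *mut i32; // retag invalidates p1's tag
            unsafe {
                *p2 = 1;
                *p1 = 2; // UB under stacked borrows; Miri reports it
            }
        }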

  12. Bytecode instructions weren't atomic even in Python's past. It was always possible for the GIL to be temporarily released, then reacquired, in the middle of operations implemented in C. This happens because C code is often manipulating the reference count of Python objects, e.g. via the `Py_DECREF` macro. But when a reference count reaches 0, this might run a `__del__` function implemented in Python, which means the "between bytecode instructions" thread switch can happen inside that reference-counting operation. That's a lot of possible places!

    Even more fun: allocating memory could trigger Python's garbage collector, which would also run `__del__` functions. So every allocation was also a possible (but rare) thread switch.

    The GIL was only ever intended to protect Python's internal state (esp. the reference counts themselves); any extension modules assuming that their own state would also be protected were likely already mistaken.

  13. Are there any embedded compilers left that try to implement their own C++ frontend? To me it looks like everyone gave up on that and uses the clang/gcc/EDG frontends.
  14. At least on Debian, installing the `atop` package will automatically install a background service running atop as root. (by default, logging some stats to /var/log/atop/ every ten minutes)
  15. Destructors/drop have issues though:

    * cannot return errors/throw exceptions (see the sketch at the end of this item)
    * cannot take additional parameters (and thus do not play well with "access token" concepts like pyo3's `Python` token that proves the GIL was acquired -- requiring the drop implementation to re-acquire the lock just in case)

    I think `defer` would be a better language construct than destructors, if it's combined with some kind of linear types that produce compiler errors if there is some code path that does not move/destroy the object.
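
    A minimal sketch of the first limitation, using a hypothetical `BufLogger` type: `fn drop(&mut self)` has a fixed signature, so an I/O error during the final flush can only be logged or swallowed, never returned to the caller.

        use std::io::Write;

        struct BufLogger {
            buf: Vec<u8>,
        }

        impl Drop for BufLogger {
            fn drop(&mut self) {
                // drop() returns (), so this io::Result is discarded.
                let _ = std::io::stdout().write_all(&self.buf);
            }
        }

        fn main() {
            let mut l = BufLogger { buf: Vec::new() };
            l.buf.extend_from_slice(b"hello\n");
        } // `l` dropped here; any write error is silently lost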

  16. Even the trivial specification "given a valid input, the program exits successfully with some arbitrary output" will already get you very far: to prove this trivial specification correct, you need to prove the absence of panics, undefined behavior, and infinite loops. Doing that for a complex program will require formulating useful invariants and proving those. So even with the trivial specification for the overall program, you'll likely find that the proof requires you to write genuinely useful specifications for several components of your program, because proving the absence of crashes in code that uses a component often relies on that component's correctness.
  17. Codegen bugs are not particularly rare either; but you usually run into them if doing "weird stuff" (which hits an edge case somewhere within the compiler). And the first instinct of most C++ programmers when seeing weird compiler behavior is to assume their weird code somehow triggered undefined behavior, so they refactor their program until it's less weird. But then it usually also no longer hits the edge case in the compiler's logic, so the program starts working correctly. Most developers then don't spend additional hours/days to investigate whether it was truly undefined behavior or if they hit a compiler bug.
  18. The crucial bit for Vec::drain is in these two lines of code, which the article lists but does not discuss:

            // set self.vec length's to start, to be safe in case Drain is leaked
            self.set_len(start);
    
    You cannot rely on Drop alone for safety, you need a fallback strategy in case of a leak (`mem::forget` or an `Rc` cycle). Rust lifetimes only ensure that there is no use-after-free; there is no guarantee that Drop is called before the end of the lifetime. It's possible to mutably access the original Vec after leaking a Drain, as demonstrated below.
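
    A small demonstration (my example) of why that line matters: leak the `Drain` and the `Vec` is still safely usable, just truncated.

        fn main() {
            let mut v = vec![1, 2, 3, 4, 5];
            let d = v.drain(1..3);
            std::mem::forget(d); // leak: Drain's Drop never runs

            // Safe, because drain() already set the length to `start`.
            // A normal drop would have produced [1, 4, 5]; the leak
            // merely loses the tail.
            assert_eq!(v, [1]);
        }
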
  19. The big differences are: 1. Rust closures are by-value structs, whereas Java closures are heap objects. 2. Rust generics are monomorphized, whereas Java type-erases them -> lots of virtual call overhead when passing a closure to a generic function.

    Sometimes, if the Java JIT manages to inline absolutely everything, it can optimize away these overheads. But in practice, Rust FP gets optimized a lot more reliably than Java FP.
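
    A small sketch of the Rust side: the closure below is an ordinary stack struct capturing `n`, and `apply` is monomorphized for it, so the call typically inlines with no allocation or virtual dispatch.

        fn apply<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
            f(x) // direct, statically dispatched call
        }

        fn main() {
            let n = 10;
            let add_n = |x| x + n; // roughly `struct AddN { n: i32 }`
            assert_eq!(apply(add_n, 5), 15);
        }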

  20. Probably because they did not think of this special case when writing the standard, or did not find it important enough to justify complicating the standard text.

    In C89, there's just a general provision for all standard library functions:

    > Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow. If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer), the behavior is undefined. [...]

    And then there isn't anything on `memcpy` that would explicitly state otherwise. Later versions of the standard explicitly clarified that this requirement applies even to size 0, but at that point it was only a clarification of an existing requirement from the earlier standard.

    People like to read a lot more intention into the standard than is reasonable. Lots of it is just historical accident, really.

  21. You misunderstood. N, M are supposed to be integers (const generics); in your example code you've made them types. Also, your `type Output = Foo<<N as Add<M>>::Output>;` just means "multiplication has the same return type as addition". But desired is that multiplying a Foo<4> with a Foo<3> results in a Foo<7>.

    Rust decided that it's important to not have instantiation-time compiler errors, but this makes computation with const generics complicated: you're not allowed to write Foo<{N+M}> because N+M might overflow, even if it never actually overflows for the generic arguments used in the program.
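
    For illustration, this is the shape of impl being asked for; stable Rust rejects the `{ N + M }` in the output type for exactly the reason above:

        struct Foo<const N: usize>;

        impl<const N: usize, const M: usize> std::ops::Mul<Foo<M>> for Foo<N> {
            // Rejected on stable: "generic parameters may not be used
            // in const operations", because N + M could overflow for
            // some instantiation.
            type Output = Foo<{ N + M }>;

            fn mul(self, _rhs: Foo<M>) -> Self::Output {
                Foo
            }
        }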

  22. That only works for C++ code using C++20 modules (i.e. for approximately nothing). With textual includes, you need to be able to switch the edition back and forth within a single compilation unit.
  23. The big philosophical difference regarding templates is that Rust wants to guarantee that generic instantiation always succeeds, whereas C++ is happy with instantiation-time compiler errors. The C++ approach does make life a fair bit easier and can maybe even avoid some of the lifetime annotation burden in some cases: in Rust, a generic function may need a `where T: 'static` constraint; in C++ with lifetimes it could be fine without any annotations as long as it's never instantiated with structs containing pointers/references.
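
    For example (my sketch), a Rust function that boxes its argument as a trait object needs the `'static` bound for every caller, even callers whose `T` contains no references at all:

        fn stash<T: std::fmt::Debug + 'static>(value: T) {
            let boxed: Box<dyn std::fmt::Debug> = Box::new(value);
            println!("{:?}", boxed);
        }

        fn main() {
            stash(42); // fine: i32 contains no references
            // let s = String::from("hi");
            // stash(&s); // error: `s` does not live for 'static
        }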

    Template specializations are not in Rust because they have some surprisingly tricky interactions with lifetimes. It's not clear that lifetimes can be added to C++ without the same issue causing safety holes with templates. At least I think this might be an issue if you want to compile a function instance like `void foo<std::string_view>()` only once, instead of once for each different string-data lifetime.

  24. The really horrible bufferbloat usually happens when the upload bandwidth is saturated -- upload bandwidth tends to be lower, so it'll cause more latency for the same buffer size. I used to have issues with my cable modem, where occasionally the upload bandwidth would drop to ~100kbit/s (from normally 5Mbit/s), and if this tiny upload bandwidth was fully used, latency would jump from the normal 20ms to 5500ms. My ISP's customer support (Vodafone Germany) refused to understand the issue and only wanted to upsell me on a plan with more bandwidth. In the end I relented and accepted their upgrade offer because it also came with a new cable modem, which fixed the issue. (back then ISPs didn't allow users to bring their own cable modem -- nowadays German law requires them to allow this)
  25. > You can send half-random input in and then send more half-random input in until you’re satisfied that the RNG has gotten a suitable amount of entropy.

    This does not actually work. If an attacker can observe output of the CSPRNG, and knows the initial state (when it did not yet have enough entropy), then piecemeal addition of entropy allows the attacker to brute-force what the added entropy was. To be safe, you need to add a significant amount of entropy at once, without allowing the attacker to observe output from an intermediate state. But after you've done that, you won't ever need to add entropy again.
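
    A toy model of the attack in Rust (emphatically not a real CSPRNG; the state mixing is just a std hasher so the sketch stays self-contained). Because each byte of entropy can be confirmed against an observed output before the next byte arrives, the attacker needs only 8 * 256 guesses instead of 2^64:

        use std::collections::hash_map::DefaultHasher;
        use std::hash::{Hash, Hasher};

        fn mix(state: u64, entropy: u8) -> u64 {
            let mut h = DefaultHasher::new();
            (state, entropy).hash(&mut h);
            h.finish()
        }

        fn output(state: u64) -> u64 {
            let mut h = DefaultHasher::new();
            (state, "out").hash(&mut h);
            h.finish()
        }

        fn main() {
            // Generator: starts from a state known to the attacker and
            // adds entropy one byte at a time, with output in between.
            let secret_entropy = [3u8, 141, 59, 26, 5, 35, 9, 89];
            let mut state = 0u64;
            let mut observed = Vec::new();
            for &b in &secret_entropy {
                state = mix(state, b);
                observed.push(output(state));
            }

            // Attacker: recover the state byte by byte.
            let mut guess = 0u64;
            for &out in &observed {
                let prev = guess;
                guess = (0..=255u8)
                    .map(|b| mix(prev, b))
                    .find(|&s| output(s) == out)
                    .unwrap();
            }
            assert_eq!(guess, state); // full state recovered
        }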

  26. Not sure if you dropped a "/s".

    In my experience, C++ template usage will always expand until all reasonably available compile time is consumed.

    Rust doesn't have C++'s header/implementation separation, so it's easy to accidentally write overly generic code. In C++ you'd maybe notice "do I really want to put all of this in the header?", but in Rust your compile times just suffer silently. On the other hand, C++'s lack of a standardized build system led to the popularity of header-only libraries, which are even worse for compile times.

  27. Even cooler is that it's possible to create an infinite-layer gzip file: https://honno.dev/gzip-quine/
  28. You can still install Windows 11 without a Microsoft account. It requires configuring the installation before you boot from the USB stick.

    I use https://rufus.ie/en/ when creating bootable USB sticks, and it turns out that this tool detects when you're trying to create a Windows installation medium and prompts you with a list of useful customizations, including "Remove requirement for online Microsoft account". (if you look through the screenshots on the webpage, there's one showing the Windows customization dialog box)
