Comment by NobodyNada

NobodyNada Dec 4, 2025 parent

Let's throw this into godbolt: https://clang.godbolt.org/z/qW3qx13qT

    is_divisible_by_6(int):
        test    dil, 1
        jne     .LBB0_1
        imul    eax, edi, -1431655765
        add     eax, 715827882
        cmp     eax, 1431655765
        setb    al
        ret
    .LBB0_1:
        xor     eax, eax
        ret

    is_divisible_by_6_optimal(int):
        imul    eax, edi, -1431655765
        add     eax, 715827882
        ror     eax
        cmp     eax, 715827883
        setb    al
        ret

By themselves, the mod 6 and mod 3 operations are almost identical -- in both cases the compiler used the reciprocal trick to transform the modulo into an imul+add+cmp, the only practical difference being that the %6 has one extra bit shift.

But note the branch in the first function! The original code uses the && operator, which is short-circuiting -- so from the compiler's perspective, perhaps the programmer expects that x % 2 will usually be false, and so we can skip the expensive 3 most of the time. The "suboptimal" version is potentially quite a bit faster in the best case, but also potentially quite a bit slower in the worst case (since that branch could be mispredicted). There's not really a way for the compiler to know which version is "better" without more context, so deferring to "what the programmer wrote" makes sense.

That being said, I don't know that this is really a case of "the compiler knows best" rather than just not having that kind of optimization implemented. If we write 'x % 6 && x % 3', the compiler pointlessly generates both operations. And GCC generates branchless code for 'is_divisible_by_6', which is just worse than 'is_divisible_by_6_optimal' in all cases.

senfiaj Dec 4, 2025

I also tried this

  bool is_divisible_by_15(int x) {
      return x % 3 == 0 && x % 5 == 0;
  }

  bool is_divisible_by_15_optimal(int x) {
      return x % 15 == 0;
  }

is_divisible_by_15 still has a branch, while is_divisible_by_15_optimal does not

  is_divisible_by_15(int):
        imul    eax, edi, -1431655765
        add     eax, 715827882
        cmp     eax, 1431655764
        jbe     .LBB0_2
        xor     eax, eax
        ret
  .LBB0_2:
        imul    eax, edi, -858993459
        add     eax, 429496729
        cmp     eax, 858993459
        setb    al
        ret

  is_divisible_by_15_optimal(int):
        imul    eax, edi, -286331153
        add     eax, 143165576
        cmp     eax, 286331153
        setb    al
        ret

This item has no comments currently.

Preferences

Keyboard Shortcuts

Story Lists

Navigation

Miscellaneous