That only works, if you know exactly, that the low bits are constant. Otherwise you may run into the issue that due to unsteady rate of RDTSC between two processes/threads that may be preemptively unscheduled between reading the HPTC and the RDTSC you might again end up with colliding time stamps. And if you took the derivative between timestamps taken in succession, you might even find is to be non-monotonic in places.
The combination of multiple counters incremented by individual unsteady clocks used to be a source for pseudo random scrambler sequences; these days we prefer LFSRs, but overall this is something that can be weird.
Hence my recommendation: Just throw xxHash32 on concatenation of the HPTC's low bits and the CPU clock cycle counter, and forgo any pretense of monotony in the low bits (because very likely you don't have it anyway).
The combination of multiple counters incremented by individual unsteady clocks used to be a source for pseudo random scrambler sequences; these days we prefer LFSRs, but overall this is something that can be weird.
Hence my recommendation: Just throw xxHash32 on concatenation of the HPTC's low bits and the CPU clock cycle counter, and forgo any pretense of monotony in the low bits (because very likely you don't have it anyway).