Comment by hyperpape - Hacker Neue

hyperpape Dec 15, 2025

Your comment is sufficiently generic that it’s impossible to tell what specific part of the article you’re agreeing with, disagreeing with, or expanding upon.

vintermann Dec 15, 2025

I disagree that performance should be a reason to choose running numbers over guids until you absolutely have to.

I think IDs should not carry information. Yes, that also means I think UUIDv7 was wrong to squeeze a creation date into their ID.

Isn't that clear enough?

mcny Dec 15, 2025

That's the creation date of that guid though. It doesn't say anything about the entity in question. For example, you might be born in 1987 and yet only get a social security number in 2007 for whatever reason.

So, the fact that there is a date in the uuidv7 does not extend any meaning or significance to the record outside of the database. To infer such a relationship where none exists is the error.

vintermann Dec 15, 2025

You can argue that, but then what is its purpose? Why should anyone care about the creation date of a by-design completely arbitrary thing?

I bet people will extract that date and use it, and it's hard to imagine a use that wouldn't be abuse. To take the example of a PN/SSN and the usual gender bit: do you really want anyone to be able to tell that you got a new ID at that time? What could you suspect if a person born in 1987 got a new PN/SSN around 2022?
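
For what it's worth, pulling the date out takes only a few lines. A minimal Python sketch, assuming the RFC 9562 version 7 layout (the top 48 bits are a Unix timestamp in milliseconds); `uuid7_timestamp` is a made-up helper name, and the sample value is the version 7 test vector from that RFC:

```python
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def uuid7_timestamp(u: uuid.UUID) -> datetime:
    """Return the creation time embedded in a version 7 UUID (hypothetical helper)."""
    if u.version != 7:
        raise ValueError("not a version 7 UUID")
    # Assumption: RFC 9562 layout, where the top 48 of the 128 bits
    # hold milliseconds since the Unix epoch.
    unix_ms = u.int >> 80
    return EPOCH + timedelta(milliseconds=unix_ms)

sample = uuid.UUID("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")
print(uuid7_timestamp(sample).isoformat())  # 2022-02-22T19:22:22+00:00
```

Nothing here needs database access: anyone who sees the ID can run it.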

Leaks like that, bypassing whatever access control you have in your database, are just one reason to use real random IDs. But it's even a pretty good one in itself.

mcny Dec 15, 2025

> What could you suspect if a person born in 1987 got a new PN/SSN around 2022?

Thank you for spelling it out for me. For the readers: it leaks information that the person is likely not a natural-born citizen. The assumption doesn't have to be a hundred percent accurate; there is a way to make that assumption and possibly hold it against you.

And there are probably a million ways that a record's creation date could be held against you. If they don't put it in writing, how will you prove they discriminated against you?

Thinking... I don't have a good answer to this. If data exists, people will extract meaning from it whether rightly or not.

infogulch Dec 15, 2025

To quote the great Mr Sparrow:

> The only rules that really matter are these: what a man can do and what a man can't do.

When evaluating security matters, it's better to strip off the moral valence entirely ("rightly") and only consider what is possible given the data available.

Another potentially concerning implication besides citizenship status: a person changed their ID when put in a witness protection program.

majorchord Dec 15, 2025

> You can argue that, but then what is its purpose? Why should anyone care about the creation date of a by-design completely arbitrary thing?

Pretty sure sorting and filtering them by date/time range in a database is the purpose.

miroljub Dec 15, 2025

If you need sorting and filtering by date, just add a timestamp to your table instead of misusing an Id column for that.
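
A rough sketch of what that could look like, using Python's built-in `sqlite3` so it runs anywhere. The `events` table, its columns, and the `events_between` helper are all made up for illustration; the ID stays an opaque random value and the date lives in its own indexed column:

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        id         TEXT PRIMARY KEY,  -- opaque random identifier (UUIDv4)
        created_at TEXT NOT NULL      -- explicit timestamp, ISO 8601 in UTC
    )
    """
)
# The index on created_at is what serves date-range queries, not the ID.
conn.execute("CREATE INDEX events_created_at ON events (created_at)")

rows = [
    (str(uuid.uuid4()), "2025-12-14T09:00:00Z"),
    (str(uuid.uuid4()), "2025-12-15T10:30:00Z"),
    (str(uuid.uuid4()), "2025-12-15T18:45:00Z"),
    (str(uuid.uuid4()), "2025-12-16T07:15:00Z"),
]
conn.executemany("INSERT INTO events (id, created_at) VALUES (?, ?)", rows)

def events_between(start: str, end: str) -> list[str]:
    """Return IDs created in [start, end), oldest first."""
    cur = conn.execute(
        "SELECT id FROM events "
        "WHERE created_at >= ? AND created_at < ? "
        "ORDER BY created_at",
        (start, end),
    )
    return [row[0] for row in cur]

print(len(events_between("2025-12-15T00:00:00Z", "2025-12-16T00:00:00Z")))  # 2
```

Comparing the timestamps as text only works because every value uses the same fixed-width UTC format; a real schema would more likely use the database's native timestamp type.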

dpark Dec 15, 2025

That is absolutely not the purpose. The specific purpose of uuidv7 is to optimize for B-Tree characteristics, not so you can craft queries based on the IDs being sequential.

This assumption that you can query across IDs is exactly what is being cautioned against. As soon as you do that, you are taking a dependency on an implementation detail. The contract is that you get a UUID, not that you get 48 bits of timestamp. There are 8 different UUID types and even v7 has more than one variant.
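
To make the B-Tree point concrete, here is a toy Python sketch. It is only an illustration: a sorted list stands in for the index, `make_uuid7` is a hand-rolled generator that assumes the RFC 9562 version 7 bit layout, and `tail_inserts` is a made-up helper that counts how many inserts land at the very end of the sorted order (the cheap case for a B-Tree) rather than somewhere in the middle:

```python
import bisect
import secrets
import uuid

def make_uuid7(unix_ms: int) -> uuid.UUID:
    """Hand-rolled UUIDv7-style value (assumed layout, not a library API)."""
    value = (unix_ms & 0xFFFFFFFFFFFF) << 80  # 48-bit millisecond timestamp
    value |= 0x7 << 76                        # version nibble
    value |= secrets.randbits(12) << 64       # 12 random bits
    value |= 0b10 << 62                       # variant bits
    value |= secrets.randbits(62)             # 62 random bits
    return uuid.UUID(int=value)

def tail_inserts(keys) -> int:
    """Count keys that end up last in sorted order at the moment they are inserted."""
    ordered = []
    count = 0
    for key in keys:
        pos = bisect.bisect_right(ordered, key)
        if pos == len(ordered):
            count += 1
        ordered.insert(pos, key)
    return count

start_ms = 1_765_756_800_000  # arbitrary starting timestamp in milliseconds
# One key per millisecond, so the timestamp prefix alone decides the order.
v7_keys = [make_uuid7(start_ms + i) for i in range(1000)]
v4_keys = [uuid.uuid4() for _ in range(1000)]

print(tail_inserts(v7_keys))  # 1000: every insert appends at the end
print(tail_inserts(v4_keys))  # a small number: most inserts land mid-index
```

Note that the benefit comes purely from insertion order; nothing in the sketch queries by the embedded timestamp.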

kentm Dec 16, 2025

B-trees too but also bucketing for formats like delta lake or iceberg, where having ids that cluster will reduce the number of files you need to update.

anamexis Dec 15, 2025

I would argue that is one of very few situations where leaking the ID's creation timestamp to someone who already has the ID is a possible concern at all.

And when working with very large datasets, there are very significant downsides to large, completely random IDs (which is of course what the OP is about).

kentm Dec 16, 2025

> You can argue that, but then what is its purpose?

The purpose is to reduce randomness while still preserving probability of uniqueness. UUIDv4s come with performance issues when used to bucket data for updates, such as when they're used as primary keys in a database.

A database like MySQL or PostgreSQL has sequential ids and you’d use those instead, but if you’re writing something like iceberg tables using Trino/Spark/etc then being able to generate unique ids (without using a data store) that tend to be clustered together is useful.

kube-system Dec 15, 2025

The time component either has meaning and it should be in its own column, or it doesn't have meaning and it is unnecessary and shouldn't be there at all.

I'm not a normalization fanatic, but we're only talking about 1NF here.

hyperpape OP Dec 15, 2025

Those are two unrelated points and the connection between them was unclear in the original post.

hxtk Dec 15, 2025

When I think "premature optimization," I think of things like making a tradeoff in favor of performance without justification. It could be a sacrifice of readability by writing uglier but more optimized code that's difficult to understand, or spending time researching the optimal write pattern for a database that I could spend developing other things.

I don't think I should ignore what I already know and intentionally pessimize the first draft in the name of avoiding premature optimization.

barrkel Dec 15, 2025

UUID v7 doesn't squeeze creation date in. If you treat it as anything other than a random sequence in your applications, you're just wrong.

zamadatix Dec 15, 2025

"What it does" and "what I think you should do with it" should not be treated as equivalent statements.

anamexis Dec 15, 2025

For what it’s worth, it was also completely unclear to me how you were responding to the article itself. It does not discuss natural keys at all.