I think IDs should not carry information. Yes, that also means I think UUIDv7 was wrong to squeeze a creation date into the ID.
Isn't that clear enough?
So, the fact that there is a date in a UUIDv7 does not extend any meaning or significance to the record outside of the database. To infer such a relationship where none exists is the error.
I bet people will extract that date and use it, and it's hard to imagine a use that wouldn't be abuse. Take the example of a PN/SSN and the usual gender bit: do you really want anyone to be able to tell that you got a new ID at a particular time? What could you suspect if a person born in 1987 got a new PN/SSN around 2022?
Leaks like that, which bypass whatever access control you have in your database, are just one reason to use truly random IDs. But it's a pretty good one in itself.
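To make the leak concrete, here is a minimal Python sketch of how little it takes to pull the date out. It assumes the RFC 9562 layout, where the top 48 bits of a UUIDv7 are a Unix timestamp in milliseconds. The function name `uuid7_created_at` is made up for illustration, and the sample value is the example UUIDv7 printed in the RFC.

```python
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def uuid7_created_at(u: uuid.UUID) -> datetime:
    """Return the creation time embedded in a UUIDv7 (hypothetical helper)."""
    if u.version != 7:
        # Other versions keep different data (or pure randomness) in these bits.
        raise ValueError("not a UUIDv7")
    unix_ms = u.int >> 80  # top 48 of the 128 bits
    return EPOCH + timedelta(milliseconds=unix_ms)

# Example UUIDv7 value from RFC 9562.
EXAMPLE = uuid.UUID("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")
print(uuid7_created_at(EXAMPLE))  # 2022-02-22 19:22:22+00:00
```

No database access is needed: anyone who sees the ID can run this.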
Thank you for spelling it out for me. For the readers: it leaks the information that the person is likely not a natural-born citizen. The assumption doesn't have to be a hundred percent accurate; there is a way to make that assumption and possibly hold it against you.
And there are probably a million ways that a record's creation date could be held against you. If they don't put it in writing, how will you prove they discriminated against you?
Thinking... I don't have a good answer to this. If data exists, people will extract meaning from it whether rightly or not.
> The only rules that really matter are these: what a man can do and what a man can't do.
When evaluating security matters, it's better to strip off the moral valence entirely ("rightly") and only consider what is possible given the data available.
Another potentially concerning implication besides citizenship status: a person changed their ID when put in a witness protection program.
Pretty sure sorting and filtering them by date/time range in a database is the purpose.
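A rough sketch of what such a range filter could look like, assuming the RFC 9562 layout (48-bit millisecond timestamp in the top bits) and assuming every ID in the column really is a UUIDv7. The helper name `uuid7_floor` is hypothetical; it builds the smallest possible UUIDv7 for a given instant, so two of them can serve as query bounds.

```python
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def uuid7_floor(dt: datetime) -> uuid.UUID:
    """Smallest UUIDv7 whose timestamp is dt (hypothetical helper)."""
    # Integer millisecond count, avoiding float rounding.
    unix_ms = (dt - EPOCH) // timedelta(milliseconds=1)
    # Timestamp, then version 7, then variant 0b10; all random bits zero.
    value = (unix_ms << 80) | (0x7 << 76) | (0b10 << 62)
    return uuid.UUID(int=value)

start = uuid7_floor(datetime(2022, 2, 22, tzinfo=timezone.utc))
end = uuid7_floor(datetime(2022, 2, 23, tzinfo=timezone.utc))
# Rows with start <= id < end were created on 2022-02-22 (UTC),
# provided the ID column holds nothing but UUIDv7 values.
```

The bounds are only meaningful under that last assumption; a column mixing in other UUID versions would silently return wrong rows.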
This assumption that you can query across IDs is exactly what is being cautioned against. As soon as you do that, you are taking a dependency on an implementation detail. The contract is that you get a UUID, not that you get 48 bits of timestamp. There are 8 different UUID types and even v7 has more than one variant.
And when working with very large datasets, there are very significant downsides to large, completely random IDs (which is of course what the OP is about).
The purpose is to reduce randomness while still preserving the probability of uniqueness. UUIDv4s come with performance issues when used to bucket data for updates, such as when they're used as primary keys in a database.
A database like MySQL or PostgreSQL has sequential IDs and you'd use those instead, but if you're writing something like Iceberg tables using Trino/Spark/etc. then being able to generate unique IDs (without using a data store) that tend to be clustered together is useful.
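As an illustration of generating such IDs with no coordination, here is a minimal Python sketch that follows the basic RFC 9562 UUIDv7 layout: a 48-bit millisecond timestamp followed by random bits. The function name `make_uuid7` is made up for this example. It is a sketch, not a production generator: it does not guarantee ordering between two IDs created in the same millisecond.

```python
import os
import time
import uuid

def make_uuid7(unix_ms=None):
    """Build a UUIDv7-style ID locally, without a data store (illustrative)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    # 48 timestamp bits on top, 80 random bits below.
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | int.from_bytes(os.urandom(10), "big")
    # Overwrite the version nibble with 7 and the variant bits with 0b10.
    value = (value & ~(0xF << 76)) | (0x7 << 76)
    value = (value & ~(0x3 << 62)) | (0x2 << 62)
    return uuid.UUID(int=value)

# IDs from different milliseconds sort by creation time,
# so rows written around the same time land near each other.
earlier = make_uuid7(1_700_000_000_000)
later = make_uuid7(1_700_000_000_001)
assert earlier < later
```

Each writer can call this independently; the timestamp prefix gives the clustering, and the 74 remaining random bits keep collisions unlikely.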
I'm not a normalization fanatic, but we're only talking about 1NF here.
I don't think I should ignore what I already know and intentionally pessimize the first draft in the name of avoiding premature optimization.