Comment by porridgeraisin

porridgeraisin Mar 19, 2025

"I cook up impractical situations and then blame my tools for it"

Nobody cares that a valid filename can contain any byte except the null byte and /. Tell me one valid use case for a non-UTF-8 filename.

coldpie Mar 19, 2025

UTF-8 is common now, but it hasn't always been. Wanting support for other encoding schemes is a valid ask (though, I think the OP was needlessly rude about it).

porridgeraisin OP Mar 19, 2025

It's backwards compatible with ASCII, right?

But yeah I suppose you would need support for all the other foreign-language encodings that came in between -- UCS-2 for example.

But basically nobody does that. GLib (which drives file reading in all GTK apps and various other apps) doesn't support anything other than UTF-8 filenames. At that point I'd consider the "migration" done and dusted.

coldpie Mar 19, 2025

The world is a lot more complicated & varied than you think :) I was digging around in some hard drives from 2004 just last weekend. At that time, lots of different encodings were common, especially internationally. Software archaeology is a common hobby, and it could be nice to be able to use a tool like this to search through old filesystems. "Not worth the effort" is definitely a valid response to the feature request, but that doesn't mean there is absolutely no use for the feature.

jcranmer Mar 19, 2025

I can definitely see a use case for supporting non-UTF-8 pathnames on disk (primarily for archaeological purposes).

In a UTF-8-path-only world, what I would do is have a mount option that says that the pathnames on disk are Latin-1 (so that \xff is mapped to U+00FF in UTF-8, whose exact binary representation I'm too lazy to work out right now), and let the people doing archaeology on that disk write their own tools to remap the resulting mojibake pathnames into more readable ones. Not the cleanest solution, but there are ways to support non-UTF-8 disks even with UTF-8-only pathnames.
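
For anyone curious what that Latin-1 mapping looks like in practice, here is a minimal Python sketch. The function names are made up for illustration, and this only mimics in userspace what such a hypothetical mount option might do. Since Latin-1 assigns a code point to every byte value, the mapping never fails and can be reversed to get the original bytes back.

```python
def latin1_name_to_utf8(raw_name: bytes) -> bytes:
    """Treat the on-disk bytes as Latin-1 and re-encode them as UTF-8.

    Hypothetical helper: every byte 0x00-0xFF maps to U+0000-U+00FF,
    so this works for any input, though the result may be mojibake.
    """
    return raw_name.decode("latin-1").encode("utf-8")


def utf8_name_to_latin1(utf8_name: bytes) -> bytes:
    """Reverse the mapping to recover the original on-disk bytes."""
    return utf8_name.decode("utf-8").encode("latin-1")


# \xff becomes U+00FF, which UTF-8 encodes as the two bytes C3 BF.
print(latin1_name_to_utf8(b"\xff"))  # b'\xc3\xbf'
```

An archaeology tool could then take the recovered bytes and try decoding them with whatever legacy encoding it suspects was actually in use.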

porridgeraisin OP Mar 20, 2025

Oh yeah I can imagine the pain for drives from that era. I remember reading that sometimes you need the right "codebook" - what was the word - installed and stuff like that.

creeble Mar 19, 2025

You do not have (or write programs for) filesystems that contain loads of ancient mp3 and wma files.

It is the bane of my existence, but many programs support all the Latin-1 and other file name encodings that are incompatible with UTF-8, so users expect _your_ programs to work too.

Now if you want me to actually _display_ them all correctly, tough turds...

porridgeraisin OP Mar 20, 2025

True. Btw, I'm curious: is there a defined encoding for text in mp3 metadata? Or is that a pain too?

shakna Mar 19, 2025

Running a shell script went badly, generating a bunch of invalid files containing random data in their names, rather than one file containing random data.

You wish to find and delete them all, now that they've turned your home directory into a monstrosity.
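
A rough sketch of that cleanup in Python, assuming a Unix-like system where filenames are arbitrary bytes. The function name is hypothetical. Passing a bytes path to os.scandir makes it return names as raw bytes, so nothing gets decoded behind your back. Note that random data can occasionally happen to be valid UTF-8, so this would only catch the names that are not.

```python
import os


def find_non_utf8_names(directory: bytes) -> list[bytes]:
    """Return full paths of entries whose names are not valid UTF-8.

    Hypothetical helper: `directory` must be bytes so that os.scandir
    yields entry names and paths as raw bytes.
    """
    bad_paths = []
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                entry.name.decode("utf-8")
            except UnicodeDecodeError:
                bad_paths.append(entry.path)
    return sorted(bad_paths)
```

Review the returned list before doing anything destructive; passing each path to os.remove would then delete the files.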

dleink Mar 19, 2025

nah, eff all that. Roll back the snapshot.
