My goal is to write general-purpose applications, and to me “absolutely necessary” means “don’t bother unless you know something extra about the target system”. I therefore assume that the suggested approach is to read/write UTF-8 with no regard for the target’s locale.
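For concreteness, here is a minimal sketch of what "UTF-8 with no regard for the locale" could look like using only `System.IO` from base. The names `writeFileUtf8` and `readFileUtf8` are hypothetical helpers made up for illustration, not anything base provides.

```haskell
import System.IO

-- Hypothetical helper: write a String as UTF-8, ignoring the locale encoding.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path str = withFile path WriteMode $ \h -> do
  hSetEncoding h utf8   -- override the locale-derived default encoding
  hPutStr h str

-- Hypothetical helper: read a file as UTF-8, ignoring the locale encoding.
-- Invalid byte sequences surface as an IOError while the contents are forced.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8
  str <- hGetContents h
  -- hGetContents is lazy; force the whole string before withFile closes the handle.
  length str `seq` pure str
```

Note that this only pins down the encoding of the file contents; the `FilePath` argument itself still goes through whatever conversion base applies to file paths.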
This directly contradicts how base handles file paths, which I wouldn’t care much about if the tools given were good enough to replicate what base does (with proper error handling, of course). Instead there’s decodeFS, which manages to compress everything I’m trying to escape into a single definition.
There’s no single “you” here.
Library users want to talk about file paths as if they are text, which is fine as long as the conversion is not their direct responsibility.
Library writers know that file paths are not text, and that any introspection more complex than looking for “<NUL>, <period>, <slash>, <newline>, and <carriage-return>” requires a choice of encoding. That choice is unavoidable when, say, parsing command-line options (think of bar in --foo=bar).
This divide cannot be bridged without an additional type that’s platform-independent (unlike OsString), but is still possibly erroneous (unlike String/Text).
Both of these are up to library users to decide, squarely outside of the topic of correct file path handling.
Used by… Rust? I have no relevant experience in Rust. When I talk about WTF-8 I mean solely how the bytes are laid out in memory; same with UCS2LE.
I hope you’re not getting confused by the fact that Rust has a type called OsString and that one seems to be in WTF-8. Their raw file path types are raw arrays, I guess?
The documentation is lacking.
This RFC looks thoroughly confusing to me. I do think of WTF-8 as “sliceable”, but only within the very narrow limits of the Portable Character Set, which guarantees that the characters are single-byte. This is enough to break down --foo=bar without scrutinizing bar, but that’s the furthest extent of what I found necessary.
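A minimal sketch of that narrow kind of slicing, assuming the argument arrives as raw bytes in a UTF-8-like layout (UTF-8 or WTF-8), where bytes below 0x80 never occur inside a multi-byte sequence. `splitLongOption` is a hypothetical name; the only library used is bytestring. It would not carry over to a UCS2LE layout, where the byte 0x3D can be half of an unrelated code unit.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Hypothetical helper: split "--name=value" at the first '=' byte (0x3D).
-- The value is returned as raw bytes and is never decoded or validated.
splitLongOption :: B.ByteString -> Maybe (B.ByteString, B.ByteString)
splitLongOption arg = do
  body <- B.stripPrefix (BC.pack "--") arg   -- must start with "--"
  let (name, rest) = B.break (== 0x3D) body  -- everything before the first '='
  (_, value) <- B.uncons rest                -- drop the '='; Nothing if there is none
  if B.null name then Nothing else Just (name, value)
```

The value part may contain arbitrary bytes, including sequences that are not valid UTF-8; whoever consumes it still has to pick an encoding, which is the point where the choice becomes unavoidable.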
Might well be an issue specific to Rust; it seems like OsStr is an interface.