Unify the docs of std::env::{args_os, args} more #84551

Merged
merged 2 commits into from
Apr 26, 2021
4 changes: 2 additions & 2 deletions library/std/src/env.rs
@@ -751,8 +751,8 @@ pub fn args() -> Args {
/// extension. This allows `std::env::args_os` to work even in a `cdylib` or `staticlib`, as it
/// does on macOS and Windows.
///
-/// Note that the returned iterator will not panic during iteration if any argument to the
-/// process is not valid Unicode. For more safety,
+/// Note that the returned iterator will not check if the arguments to the
+/// process are valid Unicode. To ensure UTF-8 validity,
Contributor

@jnqnfe jnqnfe Apr 26, 2021

I'm a bit late to the table here, having stumbled upon this just after it was merged, but I must say that I find both the original and the updated wording here to be flawed.

Filenames and paths, which are of course sometimes provided as arguments to a program, are not guaranteed to be valid UTF-8 on some platforms/filesystems. Consequently, arguments representing real filenames/paths can genuinely be non-UTF-8 strings. Using `args` to collect arguments that take filenames/paths would obviously result in a crash (a panic) for such inputs, blocking users from using real files/paths with that program.

In theory any argument (or env-var), as external input to the program, could potentially be invalid UTF-8, and simply crashing in the face of this, rather than gracefully exiting with a clear error where appropriate, is really not good behaviour.

The benefit of `args` over `args_os` is simplicity, for cases where, for instance, your app does not take any filename/path arguments and you don't care if invalid UTF-8 argument input causes a crash. It is great for beginners getting to grips with the basics of argument handling.

The benefit of `args_os` over `args` is that it lets you "correctly" handle all possible filenames/paths for such arguments, and lets you handle otherwise unexpected invalid UTF-8 input more gracefully, at the expense of the minor complication of having to handle the `OsString`/`OsStr` types. It could be viewed as the better/right choice for those with more knowledge and experience.

Thus the choice has nothing to do with safety as such, as already pointed out, but it is not really about "ensuring UTF-8 validity" either. It's about (a) simplicity - not having to check for and handle non-UTF-8 situations as above, (b) whether or not you care about properly supporting possibly non-UTF-8 filename/path arguments, and (c) whether or not you care about ungraceful crashes in the face of other non-UTF-8 input.

This is not what is conveyed by this modified documentation. It might be best, really, for some form of what I have just written to be placed into the documentation to properly inform those trying to make a choice.
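
As a rough illustration of the trade-off described above, here is a minimal sketch of using `args_os` so that path arguments are kept as-is and other arguments produce a clear error rather than a panic. The helper names `arg_to_string` and `arg_to_path`, the "first argument is a path" convention, and the exit code are assumptions made for this example only; they are not part of `std` or of the PR.

```rust
use std::env;
use std::ffi::OsString;
use std::path::PathBuf;
use std::process;

/// Hypothetical helper: convert one argument to a `String`, returning a
/// readable error message instead of panicking on invalid Unicode.
fn arg_to_string(arg: OsString) -> Result<String, String> {
    // `OsString::into_string` hands back the original value on failure.
    arg.into_string()
        .map_err(|bad| format!("argument {:?} is not valid Unicode", bad))
}

/// Hypothetical helper: treat an argument as a path. No Unicode check is
/// needed, because `PathBuf` can be built directly from an `OsString`.
fn arg_to_path(arg: OsString) -> PathBuf {
    PathBuf::from(arg)
}

fn main() {
    // Skip the program name.
    let mut args = env::args_os().skip(1);

    // Assumption for this sketch: the first argument, if any, is a file path.
    if let Some(path) = args.next().map(arg_to_path) {
        println!("path: {}", path.display());
    }

    // Remaining arguments must be text; report invalid input and exit cleanly.
    for arg in args {
        match arg_to_string(arg) {
            Ok(text) => println!("text: {}", text),
            Err(msg) => {
                eprintln!("error: {}", msg);
                process::exit(2);
            }
        }
    }
}
```

With `env::args()` the same loop would be shorter, but iterating it panics as soon as an argument is not valid Unicode, which is cases (b) and (c) above.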

Contributor Author

I would like to keep it short, so maybe it can be changed to something simple like "If you want to panic on invalid UTF-8, use the [`args`] function instead."? This sentence plainly states what happens and doesn't mention "safety" or "validity".
But it could still be nice to have the additional information you provided, so if you want to add that, please open a PR. If you think the sentence I just suggested suffices, I would be happy to make another PR changing that.

Contributor

I think that that sentence is certainly much better, but I feel that there is significant value in actually explaining the situation regarding what I wrote above, such that people reading it can make a properly informed choice. I'll try to take a look at further improving the text later. :)

Contributor Author

Actually, we have the exact same case for `vars`, `vars_os` and `var`, `var_os` too, don't we? I think you are making a really good point, and it doesn't just apply to `args` and `args_os`. Because this is more general, I think what you are saying would be a good fit for the docs of `std::env` itself? Maybe you could mention those 6 functions at the bottom there and discuss them, rather than copy-pasting the text into each of the 6.
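
For the environment-variable side, a minimal sketch of the same choice might look like the following. One detail worth noting: `env::var` reports invalid Unicode through `VarError::NotUnicode` rather than panicking, while iterating `env::vars()` does panic. The helper `describe` and the variable name `EXAMPLE_VAR` are made up for this illustration.

```rust
use std::env::{self, VarError};
use std::ffi::OsString;

/// Hypothetical helper: describe the outcome of an `env::var`-style lookup.
fn describe(result: Result<String, VarError>) -> String {
    match result {
        Ok(value) => format!("value: {}", value),
        Err(VarError::NotPresent) => "not set".to_string(),
        // The raw, non-Unicode value is carried inside the error.
        Err(VarError::NotUnicode(raw)) => format!("not valid Unicode: {:?}", raw),
    }
}

fn main() {
    // `EXAMPLE_VAR` is a made-up name used only for illustration.
    println!("{}", describe(env::var("EXAMPLE_VAR")));

    // `var_os` performs no Unicode check and returns the raw value, if any.
    let raw: Option<OsString> = env::var_os("EXAMPLE_VAR");
    println!("{:?}", raw);
}
```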

/// use the [`args`] function instead.
///
/// # Examples