Skip to content

Handle win32 separator for cygwin paths #141864

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Berrysoft
Copy link
Contributor

@Berrysoft Berrysoft commented Jun 1, 2025

This PR handles a issue that cygwin actually supports Win32 path, so we need to handle the Win32 prefix and separaters.

r? @mati865

cc @jeremyd2019

Not sure if I should handle the prefix like the windows target... Cygwin does support win32 paths directly going through the APIs, but I think it's not the recommended way.

Here I just use cygwin_conv_path because it handles both cygwin and win32 paths correctly and convert them into absolute POSIX paths.

UPDATE: Windows path prefix is handled.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jun 1, 2025
@jieyouxu jieyouxu added the O-cygwin Target: *-pc-cygwin label Jun 1, 2025
@rust-log-analyzer

This comment has been minimized.

Copy link
Contributor

@jeremyd2019 jeremyd2019 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I just use cygwin_conv_path because it handles both cygwin and win32 paths correctly and convert them into absolute POSIX paths.

I was going to raise the issue of encoding, but looking at the code it doesn't seem to deal with WIN_A as a different encoding than POSIX paths, so I guess that's not an issue. I guess I need to have a conversation on the Cygwin list to better understand this.

@Berrysoft
Copy link
Contributor Author

Finally I reuse the prefix parser from Windows. Now the cygwin Path can handle Win32 path correctly. It should behave a little different as it allows / as verbatim separater. Here is a simple test code:

use std::path::{absolute, Path};

fn main() {
    test_path("C:\\msys64");
    test_path("/dev/sda1");
    test_path("./sda1");
    test_path("/c/Users");
    test_path("\\\\?\\UNC/localhost/share");
}

fn test_path(p: impl AsRef<Path>) {
    let p = p.as_ref();
    let abs = absolute(p).unwrap();
    if p.is_absolute() {
        println!("{} == {}", p.display(), abs.display());
    } else {
        println!("{} -> {}", p.display(), abs.display());
    }
    println!("{:?}", p.components().collect::<Vec<_>>());
}

On Windows, it should output (the current drive is D:)

C:\msys64 == C:\msys64
[Prefix(PrefixComponent { raw: "C:", parsed: Disk(67) }), RootDir, Normal("msys64")]
/dev/sda1 -> D:\dev\sda1
[RootDir, Normal("dev"), Normal("sda1")]
./sda1 -> <<<current dir>>>\sda1
[CurDir, Normal("sda1")]
/c/Users -> D:\c\Users
[RootDir, Normal("c"), Normal("Users")]
\\?\UNC/localhost/share == \\?\UNC/localhost/share
[Prefix(PrefixComponent { raw: "\\\\?\\UNC/localhost/share", parsed: VerbatimUNC("localhost/share", "") })]

While on Cygwin, it should output

C:\msys64 == /c/msys64
[Prefix(PrefixComponent { raw: "C:", parsed: Disk(67) }), RootDir, Normal("msys64")]
/dev/sda1 == /dev/sda1
[RootDir, Normal("dev"), Normal("sda1")]
./sda1 -> <<<current dir>>>/sda1
[CurDir, Normal("sda1")]
/c/Users == /c/Users
[RootDir, Normal("c"), Normal("Users")]
\\?\UNC/localhost/share == //?/UNC/localhost/share
[Prefix(PrefixComponent { raw: "\\\\?\\UNC/localhost/share", parsed: VerbatimUNC("localhost", "share") })]

@Berrysoft
Copy link
Contributor Author

@jeremyd2019

I was going to raise the issue of encoding, but looking at the code it doesn't seem to deal with WIN_A as a different encoding than POSIX paths, so I guess that's not an issue. I guess I need to have a conversation on the Cygwin list to better understand this.

Actually rust doesn't handle very well for non-UTF-8 encoding on Unix. OsStr and Path are all UTF-8 (because they could be constructed from a UTF-8 str without copy) and they are passed to OS APIs directly without encoding conversion. The only concerns, I think, should be command line input and console input. Do you know that whether Cygwin handles the encoding conversion of argv and stdin? I hope it does:)

@Berrysoft
Copy link
Contributor Author

OK, seems that the command line is converted from UTF-16: https://github.com/cygwin/cygwin/blob/972872c0d396dbf0ce296b8280cee08ce7727b51/winsup/cygwin/dcrt0.cc#L902

And stdin handles the encoding conversion: https://github.com/cygwin/cygwin/blob/972872c0d396dbf0ce296b8280cee08ce7727b51/winsup/cygwin/fhandler/pty.cc#L3935

So I think all inputs are UTF-8 for a Cygwin program, and the developers could not construct a non-UTF-8 Win32 path if they aren't aware of encodings. If they are, they should know that Rust codes are UTF-8:)

@Berrysoft
Copy link
Contributor Author

Now I'm a little worried about cygwin_conv_path then. If the maintainers someday find that it should handle code page encodings, the code here will be messed up:(

@pietroalbini
Copy link
Member

This was originally reported by @Ry0taK through Rust's security disclosure process, and since this affects a still-in-development tier 3 target (with very few users right now) we agreed to let the development of the fix happen in the open.

@mati865
Copy link
Member

mati865 commented Jun 2, 2025

This change is T-libs area, so it will require their review.

r? rust-lang/libs

I'm not familiar with Cygwin API at all, but I see what you want to achieve here, and I think it's sensible.

Now I'm a little worried about cygwin_conv_path then. If the maintainers someday find that it should handle code page encodings, the code here will be messed up:(

They wouldn't want to break that API as a normal release, right? I'd expected a documented API to remain backwards compatible.

@pietroalbini I'm out of the loop, could you shed some light on your comment?

@rustbot rustbot assigned tgross35 and unassigned mati865 Jun 2, 2025
@mati865
Copy link
Member

mati865 commented Jun 2, 2025

Also, there is a typo in both the title and the first commit 😉

@Berrysoft Berrysoft changed the title Handle win32 separater for cygwin paths Handle win32 separator for cygwin paths Jun 2, 2025
@jeremyd2019
Copy link
Contributor

So I think all inputs are UTF-8 for a Cygwin program, and the developers could not construct a non-UTF-8 Win32 path if they aren't aware of encodings. If they are, they should know that Rust codes are UTF-8:)

Cygwin's encoding may not always be UTF-8 - one can set the locale environment variables like they can on Unix to set the encoding, and then the conversions of paths and terminal input will use that encoding instead of UTF-8.

@Berrysoft
Copy link
Contributor Author

In such locale, Rust doesn't even work on Linux. That's fair:)

@pietroalbini
Copy link
Member

pietroalbini commented Jun 3, 2025

@pietroalbini I'm out of the loop, could you shed some light on your comment?

@mati865 this was originally reported to [email protected] as it could defeat path traversal mitigations like this:

if p.components().into_iter().any(|x| x == Component::ParentDir) {
    panic!("path traversal");
}

By only parsing unix-like paths, it would be possible to sneak in a path traversal using win32 paths. Given the current status of the target we agreed it would be ok to develop the fix in the open.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-cygwin Target: *-pc-cygwin S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants