Skip to content

percent-encoding: make sets be values of one type, instead of types that implement a trait #519

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 17, 2019

Conversation

SimonSapin
Copy link
Member

@SimonSapin SimonSapin commented Jul 17, 2019

Fix #388


This change is Reviewable

@SimonSapin SimonSapin force-pushed the percent-encoding-consts branch from bc4503b to 7f1bd6c Compare July 17, 2019 15:26
@nox
Copy link
Contributor

nox commented Jul 17, 2019

@bors-servo r+

Niceè

@bors-servo
Copy link
Contributor

📌 Commit 7f1bd6c has been approved by nox

@bors-servo
Copy link
Contributor

⌛ Testing commit 7f1bd6c with merge 4bcb39c...

bors-servo pushed a commit that referenced this pull request Jul 17, 2019
percent-encoding: make sets be values of one type, instead of types that implement a trait

Fix #388

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/rust-url/519)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

💔 Test failed - checks-travis

@SimonSapin
Copy link
Member Author

@bors-servo r=nox

Raised minimum Rust version

@bors-servo
Copy link
Contributor

📌 Commit a1fe49e has been approved by nox

@bors-servo
Copy link
Contributor

⌛ Testing commit a1fe49e with merge 4e38c16...

bors-servo pushed a commit that referenced this pull request Jul 17, 2019
percent-encoding: make sets be values of one type, instead of types that implement a trait

Fix #388

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/rust-url/519)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

☀️ Test successful - checks-travis
Approved by: nox
Pushing 4e38c16 to master...

@bors-servo bors-servo merged commit a1fe49e into master Jul 17, 2019
@SimonSapin SimonSapin deleted the percent-encoding-consts branch July 17, 2019 15:54
@upsuper
Copy link

upsuper commented Aug 26, 2019

Ah, you removed PATH_SEGMENT_ENCODE_SET from percent-encoding... It would have been better if you could add a migration guide for that, as it was a very important part of this crate as far as I can see (because why one would do percent-encoding at all?)

@nox
Copy link
Contributor

nox commented Aug 26, 2019

It would have been better if you could add a migration guide for that

We don't have the resources to maintain such a guide.

@nox
Copy link
Contributor

nox commented Aug 26, 2019

A quick look at the diff shows that it was just renamed to PATH_SEGMENT.

@upsuper
Copy link

upsuper commented Aug 26, 2019

A quick look at the diff shows that it was just renamed to PATH_SEGMENT.

... besides it is moved to a different crate and no longer exposed.

@nox
Copy link
Contributor

nox commented Aug 26, 2019

... besides it is moved to a different crate and no longer exposed.

I see. One could write a patch to expose it again from the percent_encoding crate I guess.

@upsuper
Copy link

upsuper commented Aug 26, 2019

If you would accept such a patch, I can try to write it I guess.

@nox
Copy link
Contributor

nox commented Aug 26, 2019

I think Simon removed it because he didn't expect it to have any users, but if there are I don't really see a reason to reject such patch. You may want to confirm that with him though.

@SimonSapin
Copy link
Member Author

See discussion in #529:

What is the context of the code you’re writing? If it’s URL-related, does the url crate not already have a method that takes care of percent-encoding for you? For example PathSegmentsMut::push. If it’s not URL-related, is PATH_SEGMENT_ENCODE_SET really exactly the set you need?

If you’re implementing some spec, the removal of various pre-defined sets from the precent-encoding is intended to nudge you toward reading that spec and defining your own set based on it.

@upsuper
Copy link

upsuper commented Aug 26, 2019

The context is that I need a url as a string, and I have a hard-coded string template for it already.

It's true that PathSegmentsMut::push can be used, but going through Url means I need to write and execute more code with no benefit other than that I don't need to find a replacement of PATH_SEGMENT_ENCODE_SET.

The related code can be found here: https://github.com/upsuper/telegram-rustevalbot/blob/984a2a1b7c8d7631e74ef23001a717912b3f212d/src/cratesio/mod.rs#L124-L128

@nox
Copy link
Contributor

nox commented Aug 26, 2019

@SimonSapin

I wonder if we could provide a way to extract the PathSegmentsMut::push to its own data type not backed by a Url but a simple string.

@upsuper

The benefit of not exposing those sets directly and going through Url is that that's one less thing to maintain in the url and percent_encoding crates, and less chances for your code to not do the right things in case the WHATWG URL Standard changes. I don't think executing more code in the context of a Rust evaluation bot over Telegram is going to hurt any performance.

@upsuper
Copy link

upsuper commented Aug 26, 2019

It's not going to hurt any performance... it's just an extra annoyance to use it for this simple case.

@upsuper
Copy link

upsuper commented Aug 26, 2019

For justification, compare the following code:

fn generate_url(base_url: &str, name: &str) -> String {
    let mut url = Url::parse(base_url).unwrap();
    url.path_segments_mut().unwrap().push(name);
    url.to_string()
}
let crate_url = generate_url("https://crates.io/crates", name);
let doc_url = generate_url("https://docs.rs/crate", name);

vs.

let name_url = utf8_percent_encode(name, PATH_SEGMENT_ENCODE_SET);
let crate_url = format!("https://crates.io/crates/{}", name_url);
let doc_url = format!("https://docs.rs/crate/{}", name_url);

My intuition is that the first piece of code is more error-prone and dangerous than the second given the extra unwrap()s. That's the annoyance in my mind. It's probably not that bad, but I don't think it's unreasonable to prefer the second over the first where available.

@SimonSapin
Copy link
Member Author

The other alternative is:

use percent_encoding::{utf8_percent_encode, AsciiSet, CONTROLS};

/// https://url.spec.whatwg.org/#fragment-percent-encode-set
const FRAGMENT: &AsciiSet = &CONTROLS.add(b' ').add(b'"').add(b'<').add(b'>').add(b'`');

/// https://url.spec.whatwg.org/#path-percent-encode-set
const PATH: &AsciiSet = &FRAGMENT.add(b'#').add(b'?').add(b'{').add(b'}');

utf8_percent_encode(name, PATH)

Additionally, per https://doc.rust-lang.org/cargo/reference/manifest.html#the-name-field crates.io only allows ASCII alphanumeric characters and - and _ in package names, which don’t need to be percent-encoded.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

percent_encoding::EncodeSet cannot be boxed
4 participants