Panic when parsing a `.` in file URLs #166
Comments
After a quick |
When parsing a url such as file://./foo the parser should not panic.
This led me to find "interesting" behavior that may or may not be a bug: #171. I’m waiting to hear from Valentin. @alexcrichton Is this blocking anything? If so, we can land a workaround in the meantime. |
Ah no this isn't blocking anything on my end, this was just something I found surprising that I ran into at some point (I forget even how at this point...) |
This no longer panics as of the current version of url.

```rust
extern crate url;

fn main() {
    println!("{:?}", url::Url::parse("file://./foo"));
}
```

Bisect shows it was fixed in 9e759f1. |
@dtolnay, One problem is that the spec does NOT say whether VerifyDnsLength is expected to be set for the Conformance Test. Another problem is that, even when VerifyDnsLength (which is the reason for some of the failures) is disabled, there are still cases where the expected fields lack the leading dots that are present in the source field. For example, line 4956, which starts with a U+3002 IDEOGRAPHIC FULL STOP in the source:
I'm going to write up feedback for the authors and ask for clarification. |
I didn't see feedback regarding this on the UTC (internal) mailing list or among the submitted feedback, so I just submitted some, as follows. I'm going to submit a new PR here and work on a fix for the test data file, as I want to send that to the spec editors.

Hi there,

We have faced a couple of issues with implementing the UTS #46 Conformance Testing for the rust-url library:
For example, line 169:
(The current implementation of rust-url sets flag VerifyDnsLength because it results in a smaller failure rate for the test data.)
For example, line 4956:
Starting with U+3002 IDEOGRAPHIC FULL STOP, during the Section 4.2 ToASCII algorithm, it should fail at step Processing 4.2, because the first label has length zero. But no failure is anticipated in the data file. The test data appears to expect empty labels (or leading FULL STOPs) to be dropped from the domain name (which would allow the test cases to pass), but there is no step under Section 4 Processing or Section 4.2 ToASCII regarding this behavior.

Please see these for original discussion and more info: |
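To make the zero-length-label point concrete, here is a minimal sketch of the DNS length check as described in UTS #46 Section 4.2 (each label 1 to 63 long, the whole name 1 to 253 long, excluding the root label and its dot). This is not rust-url's code: `map_full_stops` and `passes_dns_length` are hypothetical names, the mapping helper models only the U+3002 to U+002E mapping, and the input is assumed to be ASCII already (that is, after Punycode conversion).

```rust
/// Stand-in for the UTS #46 mapping step. Only the mapping of
/// U+3002 IDEOGRAPHIC FULL STOP to U+002E FULL STOP is modeled here.
fn map_full_stops(domain: &str) -> String {
    domain.replace('\u{3002}', ".")
}

/// Sketch of the DNS length restrictions from the ToASCII step,
/// assuming `domain` is already ASCII.
fn passes_dns_length(domain: &str) -> bool {
    // The root label and its dot are excluded from the length checks.
    let domain = domain.strip_suffix('.').unwrap_or(domain);
    // Whole name: 1..=253. Every label: 1..=63.
    (1..=253).contains(&domain.len())
        && domain.split('.').all(|label| (1..=63).contains(&label.len()))
}
```

With these definitions, `map_full_stops("\u{3002}example.com")` gives `.example.com`, whose labels are `""`, `"example"` and `"com"`, so `passes_dns_length` returns `false` because the first label is empty. A trailing dot, as in `example.com.`, is still accepted.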
A retry of servo#171

This diff changes the behavior of the ToASCII step to match the spec and prevent failures in some cases when a domain name starts with leading dots (FULL STOPs), as requested in servo#166.

The change in the code results in a few failures for test cases of the Conformance Testing data provided with UTS #46. But, as the header of the test data file (IdnaTest.txt) says: "If the file does not indicate an error, then the implementation must either have an error, or must have a matching result." Therefore, failing on those test cases does not break conformance with UTS #46 and is, to some extent, anticipated.

As mentioned in servo#166, feedback has been submitted for this inconsistency and the test logic can be improved later if the data file addresses the comments. Until then, we can throw fewer errors and maintain passing conformance tests with this diff.

To keep the side effects of ignoring errors during test runs as small as possible, I have separated the `TooShortForDns` error from `TooLongForDns`. The `Error` struct has been kept private, so the change won't affect any library users.

Fix servo#166
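The rule quoted from the IdnaTest.txt header can be sketched roughly as below. This is an illustration under assumptions, not the actual test harness: `Expected`, `LengthError` and `conforms` are hypothetical names, the two variants only mirror the `TooShortForDns` / `TooLongForDns` split described in the commit message, and the handling of lines that do indicate an error (the implementation must error too) is assumed rather than quoted.

```rust
/// Hypothetical error type mirroring the split described above;
/// the crate's real `Error` is private and may look different.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LengthError {
    TooShortForDns,
    TooLongForDns,
}

/// What one line of the test data expects (hypothetical model).
enum Expected<'a> {
    Error,
    Result(&'a str),
}

/// Returns true when `actual` is acceptable for a test line.
fn conforms(expected: &Expected, actual: &Result<String, LengthError>) -> bool {
    match (expected, actual) {
        // Assumption: if the file indicates an error, we must error too.
        (Expected::Error, _) => actual.is_err(),
        // Quoted rule: with no error in the file, an error is acceptable...
        (Expected::Result(_), Err(_)) => true,
        // ...and otherwise the result must match.
        (Expected::Result(want), Ok(got)) => got.as_str() == *want,
    }
}
```

Under this reading, returning `Err(LengthError::TooShortForDns)` for a name with a leading FULL STOP still conforms even when the data file lists a result, while returning a different `Ok` value does not.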
This issue was discussed on the UTC list. Based on this and other feedback about the Apparently one of the problems in the reference implementation was using Java's regex Anyways, the new data file is here: http://unicode.org/Public/idna/10.0.0/IdnaTest.txt There are many failures here with this data file, which I'll try to fix before submitting a full PR. Unicode 10.0.0 will be released in about one month from now. So, in the meantime, we can still land #337, or just wait until the day the new data file becomes official. |
[idna] Update data to Unicode 10.0 and fix logic

* Change the behavior of the ToASCII step to match the spec and prevent failures in some cases when a domain name starts with leading dots (FULL STOPs), as requested in #166. (Another attempt at #337 and #171.)
* Update the `IdnaTest.txt` file to UCD 10.0 and fix Validation Rules, especially Bidi Rules, for the tests to pass.
* Add TODO marks for new flags introduced in the Unicode 10.0 version of UTS #46. (http://www.unicode.org/reports/tr46/proposed.html)
* Add an integration test for the `rust-url` crate for the new behavior.

Fix #166
Parsing the URL `file://./foo` will cause rust-url to panic, for example:

yields: