℅ does not encode as a domain name (Python's built-in idna encoding is insufficient) #19
Comments
Count me among the surprised as well. Looking at the idna package (a possible dependency of hyperlink in the future), it says: It's possible the builtin idna encoding is dropping the ball hard. |
Here's another one:
|
The built-in IDNA encoder seems fairly weak. |
I switched to the IDNA module to check input data and it's working far better. (It's also slower, but correct trumps fast.) |
This may be of interest to you, by the way: I'm still fleshing it out. |
More info: so I can't create a URL with
If I replace So, unless these hostnames are actually not valid, I'm in favor of using the |
Haha, from the very first report I suspected Hypothesis was at work. The idna package was already on my mind because of the split between IDNA 2003 and IDNA 2008, along with some hyperlink API changes, but I'll bump it further up the priority list after this onslaught. |
:-) thanks |
IDNA 2008 is strictly the right thing to use now. Just an FYI. Requests already uses it. ;) |
Ah, I thought we were still in some partial state where different TLDs were adhering to different standards. If not, all the better. :) |
I took a quick stab at it:
But a couple of tests don't pass:
These both appear to be the same problem, which is an error trying to decode
Looks like the
That suggests the test is wrong? |
It may well be! I don't recall writing it, so it's probably from Twisted's doctests which were probably just tested against the built-in behavior. I'm working on a local branch to get the decoding straightened out, and once we have correct behavior we'll see about profiling and idna performance possibilities. |
Cool, thanks for looking into it |
poke… this is challenging to work around when you hit it (for me, anyway). |
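One possible stopgap, sketched here under assumptions rather than taken from hyperlink itself: encode with the built-in codec, then reject any label that is not a plain letter-digit-hyphen (LDH) label. The helper name `encode_hostname_checked` and the LDH pattern are hypothetical and only illustrate the idea; this does not make the built-in codec IDNA 2008 compliant.

```python
import re

# Hypothetical LDH check: 1-63 chars of letters, digits, hyphens,
# with no leading or trailing hyphen.
_LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def encode_hostname_checked(host):
    """Encode with the built-in idna codec, then validate each label.

    Raises ValueError if the encoded result contains a label that is
    not a valid LDH label (for example, one containing a slash).
    """
    encoded = host.encode("idna").decode("ascii")
    # Allow a single trailing dot (fully qualified form).
    for label in encoded.rstrip(".").split("."):
        if not _LDH_LABEL.fullmatch(label):
            raise ValueError(
                "invalid hostname label %r in %r" % (label, encoded)
            )
    return encoded


if __name__ == "__main__":
    print(encode_hostname_checked("bücher.example"))
    try:
        encode_hostname_checked("\u2105")
    except ValueError as exc:
        print("rejected:", exc)
```

This only catches output that is obviously not a hostname; it still accepts anything IDNA 2003 allows, so switching to the idna package remains the real fix.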
Acknowledged. Alas, if only core python had used hypothesis... Anyways, I'm going to rename this slightly to represent the domain name/IDNA-specific-ness of this error and once #39 is out of the way, hopefully we can integrate the idna package soon. |
Awesome, thanks. I'll try to file a bug against Python next week. |
I'm not entirely sure this is a bug, but it sure seems like one:
The cause is probably in the IDNA encoding:
This surprises me, but I might just be ignorant about how this works…?
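A minimal sketch of what the built-in codec does with this character, assuming CPython 3's `idna` codec (which implements IDNA 2003 with nameprep). This is an illustration, not the snippet from the original report: nameprep applies NFKC normalization, which expands ℅ (U+2105) into three ASCII characters including a slash, and the codec then returns that ASCII result without checking it against hostname rules.

```python
import unicodedata

care_of = "\u2105"  # ℅ CARE OF

# NFKC normalization (one step of IDNA 2003 nameprep) expands the symbol
# into its compatibility decomposition.
normalized = unicodedata.normalize("NFKC", care_of)

# The built-in codec runs nameprep and, because the result is pure ASCII,
# returns it without punycode encoding or an STD3 hostname check.
encoded = care_of.encode("idna")

print(repr(normalized))
print(encoded)
print(b"/" in encoded)
```

The slash in the result is why the encoded value cannot be used as a host: it would be read as the start of the URL path.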