matching bytes regular expression against unicode #1133

Closed
matthiaskramm opened this issue Apr 3, 2017 · 2 comments

@matthiaskramm
Contributor

The following is valid Python 2:

re.compile(b"foo").match(u"foo")

Should typeshed allow this? Right now, https://github.com/python/typeshed/blob/master/stdlib/2/re.pyi defines compile as:

def compile(pattern: AnyStr, flags: int = ...) -> Pattern[AnyStr]: ...

And match(), in Pattern, is defined in https://github.com/python/typeshed/blob/master/stdlib/2/typing.pyi#L350 as:

class Pattern(Generic[AnyStr]):
  def match(self, string: AnyStr) -> Match[AnyStr]: ...

This disallows mixing bytes/unicode. Is this what we want?
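To illustrate why the signatures above reject the mixed call, here is a minimal sketch. `ToyPattern` and `toy_compile` are hypothetical stand-ins written for this example, not the typeshed definitions, and the prefix check only stands in for real regex matching. The point is that `AnyStr` has to resolve to the same type in `compile` and in `match`.

```python
from typing import AnyStr, Generic, Optional


class ToyPattern(Generic[AnyStr]):
    """Hypothetical stand-in for Pattern[AnyStr]; not the typeshed class."""

    def __init__(self, pattern: AnyStr) -> None:
        self.pattern = pattern

    def match(self, string: AnyStr) -> Optional[AnyStr]:
        # Toy "match": succeed when the string starts with the pattern.
        return string if string.startswith(self.pattern) else None


def toy_compile(pattern: AnyStr) -> ToyPattern[AnyStr]:
    # Mirrors the shape of re.compile's stub: the pattern type fixes AnyStr.
    return ToyPattern(pattern)


# Accepted: AnyStr is bytes for both the pattern and the subject.
same = toy_compile(b"foo").match(b"foobar")

# A checker such as mypy is expected to flag the next call, because the
# pattern fixes AnyStr to bytes while the argument is text:
# toy_compile(b"foo").match(u"foo")
```

Loosening this for Python 2 would mean letting the argument type of `match` differ from the pattern type, which this sketch does not attempt.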

Matching a bytes regexp against unicode is a TypeError in Python 3, so there's a case for not supporting this; however, we do have some cases on our side where this is flagged as a type error in valid Python 2 code.
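The Python 3 behavior can be checked with a short sketch like the following. `mixes_ok` is a hypothetical helper named for this example. Under Python 2 the mixed call is expected to succeed instead, as in the snippet at the top of this issue.

```python
import re


def mixes_ok(pattern, text):
    """Return True if `pattern` can be matched against `text` without a TypeError."""
    try:
        re.compile(pattern).match(text)
    except TypeError:
        # Python 3 refuses to apply a bytes pattern to a str (and vice versa).
        return False
    return True


bytes_on_text = mixes_ok(b"foo", u"foo")    # the mixed case from this issue
bytes_on_bytes = mixes_ok(b"foo", b"foo")   # same-type case
text_on_text = mixes_ok(u"foo", u"foo")     # same-type case
```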

@gvanrossum
Member

I'm okay with supporting this for PY2.

@JelleZijlstra
Member

This is a duplicate of #273.
