[SR-6699] Swift Compiler allows invisible Unicode characters in identifiers #49248

Open
swift-ci opened this issue Jan 4, 2018 · 4 comments
Labels
bug: A deviation from expected or documented behavior. Also: expected but undesirable behavior.
compiler: The Swift compiler itself

Comments

@swift-ci
Contributor

swift-ci commented Jan 4, 2018

Previous ID: SR-6699
Radar: None
Original Reporter: chockenberry (JIRA User)
Type: Bug

Attachment: Download

Environment

macOS 10.12.6, Xcode 9.2 (9C40b)

Additional Detail from JIRA
Votes: 0
Component/s: Compiler
Labels: Bug
Assignee: None
Priority: Medium

md5: 7b4afe08acbfae759bbab496dffa94cc

relates to:

  • SR-738 Some emojis can't be used in identifiers / Unicode 5.2
  • SR-6698 Swift Compiler does not check for canonical equivalence of identifiers that use Unicode

Issue Description:

Zero-width joiners and other invisible Unicode characters can be included in identifiers, allowing code to be obfuscated.

For example, with this code:

let ‌‌‌‌‌⁠‌‌‌⁠‌‌‌⁠‌‌‌‌⁠‌⁠‌‌‌⁠‌‌‌‌‌‌⁠‌‌‌‌⁠‌‌‌‌‌⁠‌⁠‌‌‌‌⁠‌‌‌‌‌‌⁠‌‌‌‌‌‌⁠‌⁠‌‌‌‌⁠‌‌‌‌‌⁠‌‌‌‌⁠‌‌‌‌‌⁠‌⁠‌‌‌‌‌⁠‌‌‌‌⁠‌‌‌⁠‌‌‌‌‌⁠‌‌‌‌‌‌foo = 111111111
let ‌‌‌⁠‌‌‌‌⁠‌‌⁠‌‌⁠‌‌‌⁠‌⁠‌‌‌‌‌⁠‌‌⁠‌‌‌⁠‌‌‌⁠‌‌⁠‌⁠‌‌‌⁠‌‌‌‌⁠‌⁠‌‌⁠‌‌‌⁠‌‌‌⁠‌‌‌⁠‌‌‌⁠‌‌‌‌⁠‌⁠‌‌‌⁠‌‌‌⁠‌‌‌‌‌⁠‌‌‌‌⁠‌‌‌⁠‌⁠‌‌‌⁠‌‌‌‌‌⁠‌⁠‌‌‌⁠‌‌‌‌⁠‌‌‌⁠‌‌⁠‌‌‌‌‌bar = 111111111

let bar = 1

‌‌‌‌‌⁠‌‌‌⁠‌‌‌⁠‌‌‌‌⁠‌⁠‌‌‌⁠‌‌‌‌‌‌⁠‌‌‌‌⁠‌‌‌‌‌⁠‌⁠‌‌‌‌⁠‌‌‌‌‌‌⁠‌‌‌‌‌‌⁠‌⁠‌‌‌‌⁠‌‌‌‌‌⁠‌‌‌‌⁠‌‌‌‌‌⁠‌⁠‌‌‌‌‌⁠‌‌‌‌⁠‌‌‌⁠‌‌‌‌‌⁠‌‌‌‌‌‌foo * ‌‌‌⁠‌‌‌‌⁠‌‌⁠‌‌⁠‌‌‌⁠‌⁠‌‌‌‌‌⁠‌‌⁠‌‌‌⁠‌‌‌⁠‌‌⁠‌⁠‌‌‌⁠‌‌‌‌⁠‌⁠‌‌⁠‌‌‌⁠‌‌‌⁠‌‌‌⁠‌‌‌⁠‌‌‌‌⁠‌⁠‌‌‌⁠‌‌‌⁠‌‌‌‌‌⁠‌‌‌‌⁠‌‌‌⁠‌⁠‌‌‌⁠‌‌‌‌‌⁠‌⁠‌‌‌⁠‌‌‌‌⁠‌‌‌⁠‌‌⁠‌‌‌‌‌bar

A developer could be tricked into thinking that the result is 111111111, instead of the actual result of 12345678987654321.
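
As a rough illustration of how such identifiers could be flagged, here is a minimal sketch that lists the invisible scalars in a string. The helper name `invisibleScalars(in:)` is hypothetical, and treating "general category Cf or default-ignorable" as the definition of "invisible" is an assumption made for this example, not something the compiler does. The zero-width non-joiner (U+200C) and word joiner (U+2060) in the sample are chosen only for illustration.

```swift
// Hypothetical helper: collects scalars that normally render as nothing.
// Assumption: "invisible" means general category Cf (format) or the
// Default_Ignorable_Code_Point property.
func invisibleScalars(in identifier: String) -> [Unicode.Scalar] {
    var found: [Unicode.Scalar] = []
    for scalar in identifier.unicodeScalars {
        let properties = scalar.properties
        if properties.generalCategory == .format
            || properties.isDefaultIgnorableCodePoint {
            found.append(scalar)
        }
    }
    return found
}

let plain = "foo"
let disguised = "\u{200C}\u{2060}foo"  // looks identical to "foo" on screen

print(invisibleScalars(in: plain).count)  // 0
print(invisibleScalars(in: disguised).map { String($0.value, radix: 16) })  // ["200c", "2060"]
```

A linter or editor plug-in could run a check like this over each identifier token and warn when the result is non-empty. It would also flag legitimate uses of joiners in some scripts, which is the trade-off discussed in the comments.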

@swift-ci
Contributor Author

swift-ci commented Jan 4, 2018

Comment by Craig Hockenberry (JIRA)

Also worth noting, those invisible characters are encoding a fingerprint in the code. More about that here:

https://meta.stackoverflow.com/questions/361390/can-posts-to-stack-overflow-be-fingerprinted-using-hidden-unicode-characters

@ole
Contributor

ole commented Jan 5, 2018

See my comment on SR-6698 for a bit of background about related discussions on swift-evolution.

@belkadan
Contributor

belkadan commented Jan 5, 2018

Given that these characters are necessary to properly render some words, I'm not sure there's a 100% solution here. However, it's a case where maybe a 50% solution would still be an improvement.

@ole
Contributor

ole commented Jan 8, 2018

@belkadan Yes, a 100% solution is likely impossible. UAX #31 mentions this problem in Section 2.3 Layout and Format Control Characters. It recommends:

The goals for such a restriction of format characters to particular contexts are to:

  • Allow the use of these characters where required in normal text

  • Exclude as many cases as possible where no visible distinction results

  • Be simple enough to be easily implemented with standard mechanisms such as regular expressions

It then gives a few examples where the zero-width joiner and zero-width nonjoiner would still be accepted when they appear in the context of certain scripts.

But I'm not sure if this is an exhaustive recommended list. And invisible characters such as the ZWJ are increasingly used in complex emoji, and UAX #31 doesn't talk about emoji at all (I assume because it — unlike Swift — doesn't consider emoji to be valid identifier characters anyway).
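
To make the idea of a context restriction concrete, here is a deliberately simplified sketch. It accepts a ZWJ or ZWNJ only when the scalar immediately before it is a virama (canonical combining class 9). This is an assumption made for illustration and is far narrower than the contexts UAX #31 describes; it also ignores emoji sequences entirely. The function name `hasOnlyContextualJoiners` is hypothetical.

```swift
// Hypothetical check, NOT the UAX #31 rule set: a joiner is accepted only
// when it directly follows a virama (canonical combining class 9).
func hasOnlyContextualJoiners(_ identifier: String) -> Bool {
    let joiners: Set<Unicode.Scalar> = ["\u{200C}", "\u{200D}"]  // ZWNJ, ZWJ
    var previous: Unicode.Scalar? = nil
    for scalar in identifier.unicodeScalars {
        if joiners.contains(scalar) {
            // Reject a joiner at the start or after a non-virama scalar.
            guard let prev = previous,
                  prev.properties.canonicalCombiningClass == .virama else {
                return false
            }
        }
        previous = scalar
    }
    return true
}

// Devanagari KA + VIRAMA + ZWJ + SSA: the joiner follows a virama.
print(hasOnlyContextualJoiners("\u{0915}\u{094D}\u{200D}\u{0937}"))  // true
// A ZWNJ in front of ASCII letters has no such context.
print(hasOnlyContextualJoiners("\u{200C}foo"))  // false
```

Even this narrow rule would reject the obfuscated identifiers in the report while keeping at least one class of legitimate joiner use, which is roughly the "50% solution" mentioned earlier in the thread.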

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022