Skip to content

CS2 Discussion: Output: Ambiguity between math assignment operator and regular expression literals #4943

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
coffeescriptbot opened this issue Feb 19, 2018 · 8 comments

Comments

@coffeescriptbot
Copy link
Collaborator

From @JimPanic on 2016-09-29 13:26

While sifting through the lexer code, I found that there's a regular expression called REGEX_ILLEGAL that is executed in the method @regexToken to catch this ambiguity between regular expression literals and the division-assign operator /=:

# Coffee # -> JS
console.log /; a /g # -> console.log(/; a/g); 
console.log /= b /g # -> console.log /= b /g;

So basically you cannot pass a regular expression to a function omitting the parentheses to match part of string starting with a =.

I didn't even know this operator existed in either JS or CS. Do people use this? This is a really nasty, hard to track down behaviour that you could shoot yourself in the foot with. There should be either very explicit rules OR a big fat warning if we find something like identifier<space>/=<space><something>/<something> in the code. Because it can mean more than one thing:

  • function call with a valid regex identifier(/= something/i)
  • function call with an invalid regex identifier(/= something) (syntax error)
  • valid division assignment to the identifier identifier /= something/something
  • invalid division assignment to the identifier if this second something is not a valid identifier identifier /= something/ (error: unexpected end of input, but is actually a valid regex)

JS does not have this ambiguity because it doesn't allow to omit the parentheses.

Personally, I'd like to see a clear, specific and comprehensible specification of behaviour to this other than "what the lexer does in that particular version". It could go so far as to throw an error when trying to call a function with a regular expression literal of that format as the first argument linking to the docs that states this ambiguity.

Thoughts?

PS: I suppose any changes in this regard would be breaking changes and I would also rather see that in the 2 branch if at all.

@coffeescriptbot
Copy link
Collaborator Author

From @lydell on 2016-09-29 14:34

So basically you cannot pass a regular expression to a function omitting the parentheses to match part of string starting with a =.

Not really true. See - https://github.com/jashkenas/coffeescript/wiki/Common-Gotchas#q-why-is-foo--ag-or-foo--ag-different-from-foo-ag-or-foo-ag-

I didn't even know this operator existed in either JS or CS. Do people use this?

I certainly have used it, but not as often as -= and +=.

JS does not have this ambiguity because it doesn't allow to omit the parentheses.

This. This is the problem.

Personally, I'd like to see a clear, specific and comprehensible specification of behaviour to this other than "what the lexer does in that particular version".

See #3782 (especially have a look at the tests added by that PR).

It could go so far as to throw an error when trying to call a function with a regular expression literal of that format as the first argument linking to the docs that states this ambiguity.

Do you want to forbid this:

foo /= a/g

And require either of these instead?

foo /= (a/g)
foo /\= a/g

@coffeescriptbot
Copy link
Collaborator Author

From @JimPanic on 2016-10-03 06:10

Not really true. See - https://github.com/jashkenas/coffeescript/wiki/Common-Gotchas#q-why-is-foo--ag-or-foo--ag-different-from-foo-ag-or-foo-ag-

Aha! I could've known this had a history. :)

And require either of these instead?

Either is fine but I'd personally lean towards foo /\= a/g.

Why I even created this issue is: it is a pitfall, and it's one of these you're going to debug for hours on end. The compiler knows this and should tell the user right away.

@coffeescriptbot
Copy link
Collaborator Author

From @lydell on 2016-10-03 06:13

Either is fine but I'd personally lean towards foo /\= a/g.

That doesn't make sense :) The first one is division, the second one is a function call with a regex as the argument. (Look at the syntax highlighting!)

@coffeescriptbot
Copy link
Collaborator Author

From @JimPanic on 2016-10-03 06:16

Hm? Either alone would remove the ambiguity, no? Or am I missing something?

Ok, I seem to be missing something: the compiler would have to guess "what you mean". I wonder if that'd be possible taking variable names into account.

This is a nasty ambiguity. 🙈

@coffeescriptbot
Copy link
Collaborator Author

From @lydell on 2016-10-03 06:49

Actually, my own example doesn't make sense. How would the compiler know not to throw an error for foo /= (a/g)?

I guess the crux here is to find what is valid /= division.


This problem exists because when CoffeeScript was first designed, nobody thought about that using regular /.../gimy syntax for regex and having the /= operator and having parentheses-less calls would cause ambiguous grammar. The "real" solution would rather be to choose another syntax for regex, but I guess that's not gonna happen :)

@coffeescriptbot
Copy link
Collaborator Author

From @JimPanic on 2016-10-03 07:35

The "real" solution would rather be to choose another syntax for regex, but I guess that's not gonna happen

True.

Lets just make divisions in division assignments illegal… :P

@coffeescriptbot
Copy link
Collaborator Author

From @mitar on 2016-12-11 19:50

What about:

console.log /= b /g  # regex
console.log /= b / g # division
console.log /= b / # regex

@coffeescriptbot
Copy link
Collaborator Author

From @GeoffreyBooth on 2017-11-26 02:38

Clearly this didn’t get changed for CoffeeScript 2. I’m not sure what the consensus is here, if we want to change things at all.

It seems to me that perhaps one thing we can all agree on is that there should be a Coffeelint rule for /=, to catch the common errors described above without necessarily making code uncompilable (since you can turn the rule on or off, unlike a compiler error). But that can be discussed as an issue in that repo.

Aside from that, we can certainly add a note in the docs; but I’m not sure what to do beyond that. The compiler can’t throw warnings; it either compiles happily or it throws an error and stops dead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant