Add more explicit sourceType in options and return of compile #5268

jkrems · 2019-12-23T20:08:58Z

This allows code passing the output on to other tools to correctly identify the sourceType of the output.

This PR was started by: git wf pr

This allows code passing the output on to other tools to correctly identify the `sourceType` of the output.

jkrems · 2019-12-23T20:11:33Z

src/coffeescript.coffee

@@ -100,12 +100,16 @@ exports.compile = compile = withPrettyErrors (code, options = {}) ->
  )

  # Check for import or export; if found, force bare mode.
-  unless options.bare? and options.bare is yes
+  sourceType = options.sourceType ? 'unambiguous'


This is based on the definition of sourceType in babel. In both babel sourceType: 'unambiguous' and in CoffeeScript, there's behavior based on the existence of import/export statements. In the future this could also fail properly when import is used in scripts but that may fall out of the scope of coffeescript.

FYI the original code here removes the “safety wrapper” (function() { ... }).call(this); when the source code contains import or export, since per spec import and export must be at the top level and therefore such a wrapper would always break ESM code.

Another problem with the IIFE is that .call(this) in a module wouldn't work. So even without import/export it would break when the source type is module. :)

EDIT: "Wouldn't work" as in "doesn't really make sense" because there's no value for this in a module unlike in scripts and CJS files.

jkrems · 2019-12-23T20:16:29Z

For background: I was wishing for this kind of API when working on examples for fetch hooks / service workers. It allows post-processing to be a bit smarter about how much (if any) wrapping the JS generated by the compiler needs (without having to add a 2nd layer of guessing).

GeoffreyBooth · 2019-12-26T22:14:22Z

Hi @jkrems thanks for this!

So it seems like this PR adds a new config option to the Node API? As you might've noticed, the options for CoffeeScript are kept very minimal. In general, adding a new option should have a strong justification; and it should be mirrored in the CLI too unless there's a reason why it wouldn't apply there. I'm not saying that sourceType doesn't meet that bar, but we should give it some consideration before adding it; and we should think about if there might be ways to set it automatically rather than adding it as another input option.

As part of the AST work we've been discussing how to handle sourceType, as it’s generally part of a Babel-compatible AST; for now we were thinking of just leaving it out, and having people add it after the CoffeeScript compiler does its work: #5137 (comment). My thought back then was that this could be the first pass, and once Node's behavior stabilized we could copy Node's algorithm (e.g. get package type). But I'm not sure about this; for now it can be something handled downstream by the plugins consuming the CoffeeScript AST, only by the plugins that need the field set.

So my first thought is, one thing we definitely could do is include sourceType in the Node object output (i.e. the output you get from the Node API when requesting source maps or AST) when the source code contains import or export. That's an unambiguous signal for when we could include sourceType: module. But other than that case, we can't really safely include anything else. Would it be useful at all to have sourceType only be present when it's module?

If not, then we have to decide how to determine when sourceType should be script. We could follow Node's lead and look for a package.json type field; though in the absence of such a field I think I'd rather exclude sourceType rather than set it to script, as the lack of a type field isn't really meaningful in a project where source files are getting transpiled (since Babel could be transpiling ESM CoffeeScript into CommonJS JavaScript). Alternatively we could allow the user to set sourceType as an input option to the CoffeeScript compiler; but if the user is going to do that, how is that any better than something like:

output = CoffeeScript.compile src, {bare: true, sourceMap: true}
output.sourceType = 'module'

jkrems · 2019-12-26T22:39:20Z

My thought back then was that this could be the first pass, and once Node's behavior stabilized we could copy Node's algorithm (e.g. get package type). But I'm not sure about this; for now it can be something handled downstream by the plugins consuming the CoffeeScript AST, only by the plugins that need the field set.

From my use of the Coffeescript API, using the file system to determine parse goal would've had unwanted side effects. E.g. I used to maintain a service that compiled code in-memory that didn't exist on disk and having CS check the disk for the filenames would've been incorrect (if it would've worked at all).

Alternatively we could allow the user to set sourceType as an input option to the CoffeeScript compiler; but if the user is going to do that, how is that any better than something like:

It's better because it allows - in the future - to make it explicit what the expected parse goal is. The same reason that babel/acorn/etc. all added the option instead of sticking with a "guess from AST" approach. :) E.g. it would be nice if CoffeeScript allowed await as a valid identifier in scripts. It's only supposed to be reserved in a module. Or if CoffeeScript failed when import is used outside of a module. Maybe for this PR it should only allow unambiguous since that's the closest to expressing what CoffeeScript supports today..?

So my first thought is, one thing we definitely could do is include sourceType in the Node object output (i.e. the output you get from the Node API when requesting source maps or AST) when the source code contains import or export. That's an unambiguous signal for when we could include sourceType: module. But other than that case, we can't really safely include anything else. Would it be useful at all to have sourceType only be present when it's module?

That would mean that it's impossible to tell apart if the returned code is a script or if something went wrong / the code was used with an older version of CoffeeScript. I definitely considered just adding the sourceType to the output and accepting the overhead of generating the source maps to get to the data. That may be a good enough intermediate outcome and leaves the door open for adding an input option in the future.

GeoffreyBooth · 2019-12-27T23:27:39Z

I think I'm not fully understanding your use case, or what you're trying to achieve. So there seem to be two possibilities (at least) for providing sourceType as an input option:

sourceType can be script or module
sourceType can be script or module or unambiguous, following Babel, where unambiguous is module if import/export are present or script otherwise

The first version seems rather pointless; since the CoffeeScript compiler doesn't do anything with this information other than pass it along as part of the output, it seems like whatever tool you're building that calls/wraps the CoffeeScript compiler could just add the source type as part of its output.

So it seems like what you're asking for is the second version, correct? Which could be achieved today via bare: false (or excluding bare) and looking to see if the wrapper is present, but that's hacky and clunky and brittle. Once the ast branch is merged in, which will probably happen any day now, you could also request the AST of your source code and search it for import or export statements, but that's also time consuming (and you could do the same today by parsing the output JS and searching its AST, since it's the same syntax in both).

So I guess the question is, what's the argument for providing an option to detect import/export statements? I thought as a general trend we were trying to discourage that as a way of determining Script versus Module, as opposed to more explicit signals like package.json type or .mjs (in the JS world). Besides encouraging what some would argue is a bad habit, why should this be part of the CoffeeScript compiler, if the compiler itself isn't going to do anything with this information (or should it?).

jkrems · 2019-12-28T01:06:46Z

The CoffeeScript compiler isn't just used with files on disk. It's also used in-memory. And in those cases checking other files isn't really an option. I agree that right now making the sourceType configurable wouldn't be too valuable since nobody will make CoffeeScript actually check that it's emitting valid JavaScript. But I think there is value in CoffeeScript saying what it thinks it is emitting, e.g. by adding sourceType to the output. This means that the input sourceType is implied to always be unambiguous which seems to match the existing behavior around bare: CoffeeScript considers only code with top-level import/export statements ES modules and thus will force bare when they are present.

E.g. the following is also clearly an ES module but CoffeeScript will generally not consider it ESM (as shown by the wrapper):

# should be detected as ESM but isn't
console.log import.meta

Once the ast branch is merged in, which will probably happen any day now, you could also request the AST of your source code and search it for import or export statements

Would it be possible to attach the sourceType to the AST? I think that would solve this use case for now.

I thought as a general trend we were trying to discourage that as a way of determining Script versus Module

Right, but right now it's what CoffeeScript does. I would hope that eventually CoffeeScript would add proper checks for valid syntax and potentially even allow explicit sourceType configuration. E.g. allow await as an identifier in scripts. Error in scripts that contain import.meta but allow it to appear in modules. Allow top level await but only in modules. These features really would benefit from a clear split between the two JS dialects.

For the CLI, I could imagine that sourceType could be determined automatically from package.json etc. but I don't think that works for the node API.

jkrems · 2019-12-28T01:11:35Z

For additional context: I'm looking at building a service worker to compile coffeescript on the fly. Right now there's only one mime type for coffeescript. So that's the only information I have. I can't check other files realistically, I have only the mime type and the coffee source text.

I can handle coffeescript returning non-module code if I know that it's non-module code so I can wrap it. So I need to have a clear way to communicate to coffeescript: "I got this string, it has the coffeescript mime type, do your thing, and tell me if you'd consider your output to be a module or a script/non-module". Alternatively I could also work with "I can set sourceType to module and at least get a good error message for why it's not a module".

GeoffreyBooth · 2019-12-28T04:40:20Z

And what about when there's JSX in there? https://coffeescript.org/#jsx. That's also valid in plain CoffeeScript source code.

CoffeeScript just doesn't really exist absent its build chain. From the MIME type or file extension you don't know if it's CoffeeScript 1 or 2, for example, and there are some breaking changes between them: https://coffeescript.org/#breaking-changes. You certainly don't know if it's meant to be Script or Module, CommonJS or ESM, JSX or normal JS, etc.

GeoffreyBooth · 2019-12-28T04:46:36Z

CoffeeScript considers only code with top-level import/export statements ES modules and thus will force bare when they are present.

This isn't correct. CoffeeScript forces bare when import or export statements are present specifically because they need to be top-level, and by definition the wrapper would make them not top-level. That's it. CoffeeScript isn't using the syntax as an indicator for source type, and CoffeeScript doesn't behave any differently based on one source type or the other.

All the things you're describing about erroring on certain features in certain modes sound like errors that Babel could throw, and you can pass options to the Babel that can be used as part of CoffeeScript's compilation: use the transpile option, and give it a Babel options object as its input. One of those options is sourceType.

GeoffreyBooth · 2021-11-27T06:31:48Z

I'm thinking that we should close this for age.

For additional context: I’m looking at building a service worker to compile coffeescript on the fly. Right now there’s only one mime type for coffeescript. So that’s the only information I have. I can’t check other files realistically, I have only the mime type and the coffee source text.

This is an interesting use case, and I’m not sure what the best solution would be. My first reaction is that a sourceType field isn’t enough: traditional CoffeeScript could be any flavor of JavaScript, whether classic browser script or Node CommonJS, so you need more context than just script vs module or CommonJS vs module. If you’re still interested in building this, we can discuss your use case further to figure out the best solution.

I assume this is/was more for demo purposes than anything else? Then perhaps just say up front that the service only support ESM JavaScript?

Add more explicit sourceType in options and return of compile

e360d3c

This allows code passing the output on to other tools to correctly identify the `sourceType` of the output.

jkrems commented Dec 23, 2019

View reviewed changes

GeoffreyBooth added the enhancement label Dec 26, 2019

GeoffreyBooth mentioned this pull request Jan 9, 2020

ES modules in Node, and .mjs files #5018

Open

GeoffreyBooth force-pushed the master branch from 483c9d3 to ed6733d Compare October 5, 2021 03:50

GeoffreyBooth closed this Nov 27, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add more explicit sourceType in options and return of compile #5268

Add more explicit sourceType in options and return of compile #5268

jkrems commented Dec 23, 2019

jkrems Dec 23, 2019

GeoffreyBooth Dec 26, 2019

jkrems Dec 27, 2019 •

edited

Loading

jkrems commented Dec 23, 2019

GeoffreyBooth commented Dec 26, 2019

jkrems commented Dec 26, 2019

GeoffreyBooth commented Dec 27, 2019

jkrems commented Dec 28, 2019

jkrems commented Dec 28, 2019

GeoffreyBooth commented Dec 28, 2019

GeoffreyBooth commented Dec 28, 2019

GeoffreyBooth commented Nov 27, 2021 •

edited

Loading

Add more explicit sourceType in options and return of compile #5268

Add more explicit sourceType in options and return of compile #5268

Conversation

jkrems commented Dec 23, 2019

jkrems Dec 23, 2019

Choose a reason for hiding this comment

GeoffreyBooth Dec 26, 2019

Choose a reason for hiding this comment

jkrems Dec 27, 2019 • edited Loading

Choose a reason for hiding this comment

jkrems commented Dec 23, 2019

GeoffreyBooth commented Dec 26, 2019

jkrems commented Dec 26, 2019

GeoffreyBooth commented Dec 27, 2019

jkrems commented Dec 28, 2019

jkrems commented Dec 28, 2019

GeoffreyBooth commented Dec 28, 2019

GeoffreyBooth commented Dec 28, 2019

GeoffreyBooth commented Nov 27, 2021 • edited Loading

jkrems Dec 27, 2019 •

edited

Loading

GeoffreyBooth commented Nov 27, 2021 •

edited

Loading