Root AST #5137

helixbass · 2018-11-21T01:09:44Z

@GeoffreyBooth PR for root Block AST

Based on jsx-element-ast, here is the diff against that branch

src/nodes.coffee

helixbass · 2018-11-21T01:15:14Z

src/nodes.coffee

+  rootToAst: ->
+    programLocationData = @astLocationData()
+
+    programAst = Object.assign


A Babel-style AST is rooted at a File node, which has a Program child node
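For readers less familiar with the Babel layout, here is a minimal sketch of that File/Program nesting. The field names mirror the diff hunks in this PR (`type`, `program`, `comments`, `sourceType`, `body`, `directives`); the interfaces and the `wrapInFile` helper are illustrative assumptions, not CoffeeScript's own definitions.

```typescript
// Field names follow the diff hunks in this PR; the interfaces themselves
// are illustrative and not part of CoffeeScript.
interface ProgramAst {
  type: "Program";
  sourceType: "module";
  body: unknown[];
  directives: unknown[]; // left empty for now (e.g. 'use strict' is not detected)
}

interface FileAst {
  type: "File";
  program: ProgramAst;
  comments: unknown[]; // left empty for now
}

// Hypothetical helper: wrap a list of statement nodes in Program, then File.
function wrapInFile(body: unknown[]): FileAst {
  const program: ProgramAst = {
    type: "Program",
    sourceType: "module",
    body,
    directives: [],
  };
  return { type: "File", program, comments: [] };
}

const fileAst = wrapInFile([{ type: "ExpressionStatement" }]);
```

The statements of the source file end up in `fileAst.program.body`, while `fileAst.comments` is where a flat list of comments would eventually go.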

Rather than a Block having two “modes,” root and body, should we perhaps split this class up into a base and children? Like BlockBase as a parent for Root and Block?

I also wonder if file.program should be handled outside of the node classes, since that’s more like part of the file itself than part of the AST? We could return a root with expressions like we were before, or perhaps rename expressions to body in the AST, and then the parent file.program could be added outside of this like in coffeescript.coffee or somewhere else.

> Rather than a Block having two “modes,” root and body, should we perhaps split this class up into a base and children? Like BlockBase as a parent for Root and Block?

Ok introduced a new Root class, but made it a parent of the root Block rather than a subclass variation of it

> I also wonder if file.program should be handled outside of the node classes, since that’s more like part of the file itself than part of the AST?

Handling the root File AST node outside of the node classes makes a certain amount of sense conceptually, but in practice it seems like it should stay in the node classes, since (a) that File node needs to have comments attached to it and (b) it seems weird not to be able to use our AST methods to generate that root AST node

So I think this makes the most sense: now we can actually use the standard AST methods for both the File AST node (which corresponds to the Root node class) and the Program AST node (which corresponds to the root block, i.e. the body property of the Root node class instance), so from an AST-generation perspective it's a significant cleanup

This also cleans up the root case when compiling JS a bit: Root::compileNode() now does most of what Block::compileRoot() did previously, and Block::compileToFragments() is gone. Since Root::compileNode() already knows that its @body is the root block, it can call the still-special Block::compileRoot() on it directly
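The division of labor described above can be sketched with a toy model. The class and method names (`Root`, `Block`, `compileNode`, `compileRoot`, `body`) come from the comment; the method bodies are simplified assumptions and do not reproduce what nodes.coffee actually does.

```typescript
// Toy model of the Root/Block split. Names follow the discussion; the
// bodies are simplified assumptions, not the real nodes.coffee logic.
type Ast = { type: string; [key: string]: unknown };

class Block {
  constructor(public expressions: string[]) {}

  // The root block corresponds to the Program AST node.
  ast(): Ast {
    const body = this.expressions.map((source) => ({
      type: "ExpressionStatement",
      source,
    }));
    return { type: "Program", body, directives: [] };
  }

  // Still a special case for the root block, but now only Root calls it.
  compileRoot(): string {
    return this.expressions.map((source) => `${source};`).join("\n");
  }
}

class Root {
  constructor(public body: Block) {}

  // Root corresponds to the File AST node and delegates Program to its body.
  ast(): Ast {
    return { type: "File", program: this.body.ast(), comments: [] };
  }

  // Root knows its body is the root block, so it calls compileRoot directly.
  compileNode(): string {
    return this.body.compileRoot();
  }
}

const rootNode = new Root(new Block(["a = 1", "b = 2"]));
const rootAst = rootNode.ast();
const compiledJs = rootNode.compileNode();
```

In this model neither class needs a "mode" flag for AST generation: each one answers `ast()` for exactly one AST node type.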

The interesting thing I ran into was that producing/expecting this new Root node was breaking the existing interfaces of CoffeeScript.nodes() and the corresponding call to .compile() (on the root node returned by CoffeeScript.nodes())

For .compile(), I added a backwards-compatibility override Block::compile() which checks if it needs to wrap itself in a new Root. So that way any existing code that expects to be able to call .compile() on a "root" Block instance will still work

For CoffeeScript.nodes() it was slightly more interesting. We could just return the new Root instance (as the root of the node class instance tree), and then clean up internal references (eg in repl.coffee, a few tests) that expect the return value of CoffeeScript.nodes() to be a Block instance

But then any third-party consumers of the CoffeeScript.nodes() API that expect it to return a Block would break. So what I did was return the "root" Block instance (ie the body of the Root instance) from CoffeeScript.nodes(). This is a little weird in that it makes the Root instance impossible to obtain a reference to via CoffeeScript.nodes(), but I think it's preferable to introducing breakage for consumers of CoffeeScript.nodes()?
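Both backwards-compatibility measures can be illustrated with a small stand-in. This is a sketch under stated assumptions: the `root` back-reference, `parseToRoot`, and the free function `nodes` are hypothetical, and the real check inside Block::compile() may work differently.

```typescript
// Toy model: names follow the thread, bodies are assumptions for illustration.
class Root {
  constructor(public body: Block) {
    body.root = this; // hypothetical back-reference consulted by Block.compile()
  }

  compileNode(): string {
    return this.body.compileRoot();
  }
}

class Block {
  root?: Root;

  constructor(public expressions: string[]) {}

  compileRoot(): string {
    return this.expressions.map((source) => `${source};`).join("\n");
  }

  // Backwards-compatible entry point: callers holding a "root" Block can
  // still call compile(); the block wraps itself in a Root when it has none.
  compile(): string {
    const owner = this.root ?? new Root(this);
    return owner.compileNode();
  }
}

// Hypothetical stand-in for the parser: builds the full tree, Root included.
function parseToRoot(source: string): Root {
  const lines = source.split("\n").filter((line) => line.trim() !== "");
  return new Root(new Block(lines));
}

// Stand-in for CoffeeScript.nodes(): hands back the root Block, not the Root.
function nodes(source: string): Block {
  return parseToRoot(source).body;
}

const legacyBlock = nodes("a = 1\nb = 2");
const legacyOutput = legacyBlock.compile();
```

A consumer written against the old API keeps working: it receives a `Block` and calls `.compile()` on it, without ever seeing the `Root`.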

src/nodes.coffee

helixbass · 2018-11-21T01:20:50Z

src/nodes.coffee

+      type: 'Program'
+      sourceType: 'module'
+      body: @bodyToAst()
+      directives: []


For now, don't look for directives (eg 'use strict')

helixbass · 2018-11-21T01:21:08Z

src/nodes.coffee

+    Object.assign
+      type: 'File'
+      program: programAst
+      comments: []


For now, don't worry about comments

Comments are going to be a nightmare.

They are and they aren't:

We only need to provide a list of all comments here on the root File node (not individual comments attached to their corresponding AST nodes), so we don't have to worry about that

But there were a fair number of little things that needed tweaking to get comments to behave correctly/as expected for Prettier and ESLint (eg location data, more awareness of comments when placing generated tokens)

I'm planning on comments being the last big thing we cover in these PRs (after we've gone through all the node classes)
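As a rough illustration of "a list of all comments on the root File node", the sketch below flattens comments into one position-ordered array. The `CommentLine` and `CommentBlock` type names are Babel's; everything else (`LexedToken`, the idea of reading comments off tokens, `collectComments`) is an assumption made for the example.

```typescript
// Hypothetical shapes: only the idea of one flat list on the File node
// comes from the thread.
interface CommentAst {
  type: "CommentLine" | "CommentBlock";
  value: string;
  start: number; // offset into the source
  end: number;
}

interface LexedToken {
  tag: string;
  comments?: CommentAst[]; // comments attached to this token, if any
}

// Gather every comment into a single list sorted by position, suitable for
// the `comments` property of the root File node.
function collectComments(tokens: LexedToken[]): CommentAst[] {
  const all: CommentAst[] = [];
  for (const token of tokens) {
    for (const comment of token.comments ?? []) {
      all.push(comment);
    }
  }
  return all.sort((a, b) => a.start - b.start);
}

const sampleTokens: LexedToken[] = [
  { tag: "IDENTIFIER", comments: [{ type: "CommentLine", value: " second", start: 20, end: 28 }] },
  { tag: "NUMBER" },
  { tag: "TERMINATOR", comments: [{ type: "CommentBlock", value: " first ", start: 0, end: 11 }] },
];
const fileComments = collectComments(sampleTokens);
```

Nothing here attaches a comment to a particular AST node, which is the part the thread says can be skipped.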

helixbass · 2018-11-21T01:22:31Z

src/nodes.coffee

+      programLocationData
+
+  ast: ->
+    @rootToAst()


For now act as though all Blocks are root blocks. Will introduce nested blocks next

GeoffreyBooth · 2018-11-24T07:13:24Z


src/nodes.coffee

helixbass

@GeoffreyBooth updated to add Root node class, see comments

helixbass · 2018-11-26T17:10:40Z


helixbass · 2018-11-26T17:41:22Z


src/nodes.coffee

src/coffeescript.coffee

src/nodes.coffee

GeoffreyBooth · 2018-11-30T05:47:22Z

> verify my assumption that it’ll be easy to “monkeypatch” it back in as needed inside the ESLint plugin, since there are places there that expect it to be present

I would think you could just add it as a property in your middleware plugin, like (pseudocode):

ast = CoffeeScript.compile(source, ast: yes)
ast.program.sourceType = 'module'
# use `ast` with ESLint

src/nodes.coffee

helixbass

@GeoffreyBooth updated per your comments

> I would think you could just add it as a property in your middleware plugin

Yup the ESLint plugin already has a transformation step (to go from Babel-style AST to ESLint/espree-style AST, like what babel-eslint does), so this would fit naturally there
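A transformation step of that kind could add the property without mutating the compiler's output. This is only a sketch: `FileLike`, `ProgramLike`, and `withSourceType` are hypothetical names, and the actual plugin's transformation is not shown in this thread.

```typescript
// Hypothetical shapes for the AST handed to the plugin.
interface ProgramLike {
  type: string;
  sourceType?: "module" | "script";
  [key: string]: unknown;
}

interface FileLike {
  type: string;
  program: ProgramLike;
  [key: string]: unknown;
}

// Hypothetical transformation step: return a copy of the AST with
// sourceType filled in, leaving the original object untouched.
function withSourceType(
  ast: FileLike,
  sourceType: "module" | "script" = "module"
): FileLike {
  return { ...ast, program: { ...ast.program, sourceType } };
}

const compilerAst: FileLike = {
  type: "File",
  program: { type: "Program", body: [], directives: [] },
  comments: [],
};
const lintAst = withSourceType(compilerAst);
```

Copying rather than assigning in place keeps the step safe to run even if the same AST object is reused elsewhere.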

src/nodes.coffee


GeoffreyBooth · 2018-12-03T09:11:42Z

@helixbass I pushed some changes, please review in particular 3677d11

I refactored Block.astProperties to use expression.astLocationData() to get location data, rather than extracting the location data from the whole AST object; this felt more object-oriented and logical to me. I also moved the logic for generating astProperties into that method, rather than spreading it out across several methods on the Block class that all appear to be internal (as in it wouldn’t make sense to call those methods from outside of Block, they’re really just helper functions for astProperties). All the tests still pass, please let me know what you think.
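One way to picture the refactor, assuming the location data is needed when a non-statement expression gets wrapped in an `ExpressionStatement`: ask the node class for its location instead of copying fields out of the AST object it produced. The interfaces, the `isStatement` flag, and `bodyToAst` as a free function are assumptions for illustration, not the code in 3677d11.

```typescript
interface LocationData {
  start: number;
  end: number;
}

type AstNode = { type: string; [key: string]: unknown };

// Hypothetical node interface: each expression can produce its AST and
// report its own location data.
interface ExpressionNode {
  isStatement: boolean; // hypothetical flag
  ast(): AstNode;
  astLocationData(): LocationData;
}

// Build the body in one place; wrap bare expressions in ExpressionStatement
// and take the wrapper's location from the node, not from the child AST.
function bodyToAst(expressions: ExpressionNode[]): AstNode[] {
  return expressions.map((expression) => {
    const expressionAst = expression.ast();
    if (expression.isStatement) return expressionAst;
    return {
      type: "ExpressionStatement",
      expression: expressionAst,
      ...expression.astLocationData(),
    };
  });
}

const newDate: ExpressionNode = {
  isStatement: false,
  ast: () => ({ type: "NewExpression", start: 0, end: 10 }),
  astLocationData: () => ({ start: 0, end: 10 }),
};
const bodyAst = bodyToAst([newDate]);
```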

One other thing is that we’re not defining Block.astLocationData, but it appears we don’t need to, the version it inherits from Base seems to work fine. We’re not testing the location data for the Program node, though; not sure if we need to.

helixbass · 2018-12-03T17:34:40Z

@GeoffreyBooth the changes you made look fine including the refactor

You're right, there currently aren't AST location data tests for the root File/Program. I was just messing with this and ran into some issues related to whether the location data of a trailing TERMINATOR (generated or not) gets included in the File/Program AST location data. I think this deserves a little deeper examination, so I'd like to treat this as a separate location-data-related TODO

GeoffreyBooth · 2018-12-03T18:15:43Z

I noticed exactly the same thing, about a newline at the end of the file, when I was comparing our AST for something trivial like new Date() with the AST from astexplorer.net. Might be worth nailing down which way it should be and adding a test for that.

helixbass · 2018-12-03T18:43:29Z

@GeoffreyBooth ok I pushed a commit fixing some of the behavior related to root location data and with a commented-out TODO test for a part that still isn't behaving correctly

The thing that's still not behaving correctly is the case when there's an actual trailing newline in the source file. From what I can tell the root of the issue is that in outdentToken() in lexer.coffee (called by closeIndentation()) it always creates the TERMINATOR token with length: 0 (regardless of whether it corresponds to an actual newline and thus should be length: 1)

But I don't feel confident trying to apply an immediate fix there, so I left a commented-out AST location data test case with an actual trailing newline which should pass once we apply a fix such that the location data of the trailing TERMINATOR token is always correct
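The effect of a zero-length TERMINATOR can be shown with a simplified token model. Only the `length: 0` detail comes from the comment above; the `LexToken` shape, the offsets, and `rootEnd` are assumptions, and the real lexer records location data differently.

```typescript
// Hypothetical token model: offset and length stand in for the lexer's
// location data.
interface LexToken {
  tag: string;
  offset: number;
  length: number;
}

// End offset of the root, derived from the last token.
function rootEnd(tokens: LexToken[]): number {
  const last = tokens[tokens.length - 1];
  return last ? last.offset + last.length : 0;
}

const sourceText = "a = 1\n";
const lexed: LexToken[] = [
  { tag: "IDENTIFIER", offset: 0, length: 1 },
  { tag: "=", offset: 2, length: 1 },
  { tag: "NUMBER", offset: 4, length: 1 },
  // Corresponds to a real newline, but is recorded with length 0.
  { tag: "TERMINATOR", offset: 5, length: 0 },
];

const observedEnd = rootEnd(lexed); // 5: the trailing newline is not counted
const expectedEnd = sourceText.length; // 6: what a length-1 TERMINATOR would give
```

With `length: 1` on the last token the two values would agree, which is the condition the commented-out test is waiting for.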

GeoffreyBooth · 2018-12-07T06:26:02Z

Which parser are you treating as your target in astexplorer.net? babylon7?

I’m thinking about the case where the code ends with multiple newlines, or a comment:

a = 1


# copyright 2018

In astexplorer, everything all the way to the last character in the comment is included in the root location end. I’m thinking that we may need to handle this as a special case, either in coffeescript.coffee or by noting the original source length from the lexer and passing it as extra data to the Root class.

helixbass · 2018-12-10T16:26:29Z

> Which parser are you treating as your target in astexplorer.net? babylon7?

@GeoffreyBooth yes typically babylon7

> I’m thinking about the case where the code ends with multiple newlines, or a comment

This is a good special case to check if it handles it correctly. We should wait until we've gotten through comments to address it though, since there will be similar including-comment-locations-in-Block-location-data fixes included there, which would probably affect the handling of this particular special case (where the "root block" location data should include a trailing comment)

helixbass · 2019-01-08T20:37:44Z

@GeoffreyBooth was there something else you wanted to see addressed in this PR?

GeoffreyBooth · 2019-01-15T02:53:58Z

@helixbass Please take a look at helixbass@a85b987, I think I’ve fixed the location data end values for the root/Program»File node. I think the issue was due to the clean function in the lexer, which trims extraneous whitespace before the lexing begins.

helixbass · 2019-01-16T20:39:27Z

@GeoffreyBooth ok the approach you took makes sense to me (handling the root File/Program node location data as a special case where we just reference the original complete source code string directly rather than try and account for destructive operations eg the lexer's clean()), merging
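The special case can be sketched as a function of the untouched source string alone. This is not the code from a85b987: `rootLocationFromSource` and its types are hypothetical, and the sketch assumes `\n` line endings, 1-based lines, and 0-based columns as in Babel-style location data.

```typescript
interface SourcePosition {
  line: number; // 1-based
  column: number; // 0-based
}

interface RootLocation {
  start: number;
  end: number;
  loc: { start: SourcePosition; end: SourcePosition };
}

// Compute the root location from the original source string, so trailing
// newlines and comments are covered regardless of any lexer cleanup.
function rootLocationFromSource(source: string): RootLocation {
  const lines = source.split("\n");
  const lastLine = lines[lines.length - 1] ?? "";
  return {
    start: 0,
    end: source.length,
    loc: {
      start: { line: 1, column: 0 },
      end: { line: lines.length, column: lastLine.length },
    },
  };
}

// An assumed version of the example from the thread: code, two blank
// lines, then a trailing comment.
const exampleSource = "a = 1\n\n\n# copyright 2018";
const rootLoc = rootLocationFromSource(exampleSource);
// rootLoc.end is 24; rootLoc.loc.end is line 4, column 16
```

Because nothing here depends on tokens, trimming done before lexing cannot shorten the reported range.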

helixbass · 2019-01-16T20:40:16Z

@GeoffreyBooth oh wait I can't merge into ast, so looks good for you to merge 👍

GeoffreyBooth · 2019-01-16T21:10:18Z

Awesome, thanks 😄

helixbass commented Nov 21, 2018


GeoffreyBooth reviewed Nov 24, 2018


helixbass commented Nov 26, 2018


GeoffreyBooth reviewed Nov 28, 2018


helixbass added 4 commits November 28, 2018 16:08

root ast

59d2ab3

updated grammar

8a7b4be

preserve CoffeeScript.nodes() API

21ea0ee

root ast methods

fe38658

helixbass force-pushed the root-ast branch from 1d01f65 to fe38658 Compare November 28, 2018 21:11

GeoffreyBooth reviewed Nov 30, 2018


helixbass added 2 commits December 2, 2018 12:50

updates from code review

7945260

merge ast

6330897

helixbass commented Dec 2, 2018


GeoffreyBooth added 5 commits December 3, 2018 00:33

Style

a8adedc

Fix a few missing returns

545aeb8

Expand sourceType explanation

eac8a63

Simplify

2d2e98f

Refactor Block.astProperties: use expression.astLocationData() to get location data, rather than extracting it from the whole AST object; move all the logic into one function, rather than spreading it out across several functions on the Block class that all appear to be internal

3677d11

testing root location data

93d9841

Fix location end data for root/File » Program AST node

a85b987

GeoffreyBooth approved these changes Jan 15, 2019


GeoffreyBooth merged commit 38c8b2f into jashkenas:ast Jan 16, 2019

GeoffreyBooth mentioned this pull request Dec 26, 2019

Add more explicit sourceType in options and return of compile #5268

Closed
