Support semantic tokens protocol #1721

sebastiansturm · 2020-05-24T13:27:11Z

this PR adds support for the semantic tokens protocol as defined by LSP 3.16, see #1482.

I've been using commit 9c03646... for about 4 weeks with a trunk build of clangd 11. Today's commits have only received superficial testing but should be relatively unlikely to break things.

The following points have not been addressed yet:

I have assumed that there is at most one language server per buffer to provide semantic tokens. Adding support for multiple providers shouldn't require much effort but also doesn't make much sense if there is no use case. If you know of one, please let me know so I can use that to test an extended implementation.
This PR no longer uses overlays for semantic highlighting (at least not when the :semantic-tokens option is used) but instead suspends regular font-locking, waits for semantic tokens to arrive from the language server, afterwards applies regular font locking and then applies semantic tokens on top of that. If the server is slow to respond, this will delay initial fontification, so if people think this is annoying the first fontification run could be performed immediately. Also, if semantic highlighting faces overlap with regular font-locking, semantic highlighting will take precedence; this could be changed to a true layering approach by replacing put-text-property with add-text-properties. Finally, some language servers might provide sufficiently rich highlighting data to completely replace the font-locking mechanism provided by the underlying major mode. If so, I could add an option to do that.
Unfortunately, clangd doesn't support range-based incremental highlighting (yet?). It does support semantic token edits though I'm not sure how much of a performance win that will be. I plan to implement it anyway and run some benchmarks, though I'd rather do range-based highlighting first. Any recommendations for a language server with robust range-based semanticTokens support?

clangd returns tokenModifiers = [], so I'll need another language server to test this against. There's a new defcustom to enable/disable modifier highlighting that defaults to disabled for the time being

yyoncho · 2020-05-24T13:50:08Z

3. Any recommendations for a language server with robust range-based semanticTokens support?

I guess rust analyzer might support it, @flodiebold?

@sebastiansturm

You have several unused variables which cause CI failure: https://travis-ci.org/github/emacs-lsp/lsp-mode/jobs/690604604#L1473

yyoncho

Looks good to me.

I would propose merging the PR as it is with semantics disabled. Once it gets
enough testing I propose having it enabled by default and deleting the rest of
the highlighting related code which apparently won't go into the spec.

Up to now the following servers support semantic highlighting.

Rust Analyzer - latest version
clangd - latest version build from master branch
Dart (?) cc @ericdallo

I propose also adding some visual indication that the syntax highlighting is
about to be applied when we are waiting for the server to respond otherwise it
looks a bit broken. If we agree on that, this could be addressed in a separate PR.

PS: AFAICS Rust Analyzer supports range provider

 |-[-] semanticTokensProvider:
 |  |-[X] rangeProvider: t

yyoncho · 2020-05-24T17:47:57Z

lsp-mode.el

+(defvar-local lsp--semantic-tokens-teardown nil)
+
+(defun lsp--semantic-tokens-initialize-buffer ()
+  (message "semantic-tokens-initialize-buffer")


lsp-mode.el

tschuett · 2020-05-24T18:38:58Z

Would you mind adding instructions to the README.md on how to select the different modes? Thanks.

ericdallo · 2020-05-24T19:18:50Z

Currently, Dart SDK doesn't have support for semantic tokens protocol yet, but since last year they are migrating for their own server protocol to LSP, I hope they implement this soon, so I opened an issue asking for that support.

sebastiansturm · 2020-05-24T22:04:11Z

Would you mind adding instructions to the README.md on how to select the different modes? Thanks.

would it be enough to mention there's a (documented) option called lsp-semantic-highlighting? I wouldn't want to bog down the otherwise brief overview section with too much text.

sebastiansturm · 2020-05-24T22:09:17Z

Looks good to me.

I would propose merging the PR as it is with semantics disabled. Once it gets
enough testing I propose having it enabled by default and deleting the rest of
the highlighting related code which apparently won't go into the spec.

the sooner we can get rid of the old highlighting code, the better. Though I'd wait at least until clangd 11 is released, as clangd 10 only supports the Theia protocol

Up to now the following servers support semantic highlighting.
1. Rust Analyzer - latest version

2. clangd - latest version build from master branch

3. Dart (?) cc @ericdallo
I propose also adding some visual indication that the syntax highlighting is
about to be applied when we are waiting for the server to respond otherwise it
looks a bit broken. If we agree on that, this could be addressed in a separate PR.

good idea. I think with clangd the delay is short enough to render this unnecessary, but other language servers may not be as quick

PS: AFAICS Rust Analyzer supports range provider
 |-[-] semanticTokensProvider:
 |  |-[X] rangeProvider: t

nice. I'll clone something written in Rust then and try to set up rust-analyzer

tschuett · 2020-05-25T04:33:22Z

No worries.

yyoncho · 2020-05-25T10:38:39Z

but other language servers may not be as quick

From what I saw with rust analyzer initially it may take a lot of time because even that I was using a hello world application to test the delay was visible.

nsemrau

Note about a twice defined defface.

nsemrau · 2020-05-29T16:14:18Z

lsp-mode.el

+  "Face used for types."
+  :group 'lsp-faces)
+
+(defface lsp-face-semhl-type


Was it intentional to define this defface twice, identical to to one just above it?

no, wasn't intentional, thanks for the pointer. I assume this doesn't break anything (or does it?), so I'll just fix this as part of the incremental highlighting PR (have rust-analyzer up and running and incremental highlighting seems to work, but it currently doesn't update correctly under all circumstances; plan to work on it on Sunday evening or Monday)

dgutov · 2020-06-01T14:05:47Z

Have you tried using the font-lock-face text property? It might make the font-lock removal/re-application dance unnecessary.

sebastiansturm · 2020-06-02T19:09:23Z

Have you tried using the font-lock-face text property? It might make the font-lock removal/re-application dance unnecessary.

no, so far I haven't. Didn't even know about it, so thanks for the hint! As far as I understand, I could just apply font-lock-face properties every time a new token set arrives (presumably after cleaning existing font-lock-face properties within the region to be fontified?) so font-lock would pick up those faces automatically, and otherwise leave font locking as it is? If so, I do like that approach for being much cleaner than what I have now.
On the other hand though, it seems to me I could then no longer suspend refontification until the new tokens have arrived, in which case fontification would be performed twice (once with regular font locking + old tokens, then with regular font locking + new tokens). Is that true, or did I miss something?
Also, I wonder how different nonstandard fontification mechanisms would work together if we were to use font-lock-face. If another mechanism wanted to provide fontification on top of regular font-locking, how would that other mechanism be chained together with lsp's semantic tokens? (Though even if there was no convenient way to do that with font-lock-face, I can only think of treesitter as an alternative mechanism and I assume treesitter would just replace regular font locking wholesale, so that would be a nonissue. Just wanted to bring it up to make sure I'm not missing something here)

dgutov · 2020-06-02T19:40:56Z

(presumably after cleaning existing font-lock-face properties within the region to be fontified?)

Yup. You can follow js2-mode's example: https://github.com/mooz/js2-mode/blob/master/js2-mode.el#L6641-L6657. Its highlighting doesn't use font-lock either.

On the other hand though, it seems to me I could then no longer suspend refontification until the new tokens have arrived, in which case fontification would be performed twice (once with regular font locking + old tokens, then with regular font locking + new tokens).

I suppose it is true. On the other hand, you only need to clear existing highlighting when you receive the response from the server. So this approach can work without blinking, or similar visual aggravations.

Also, I wonder how different nonstandard fontification mechanisms would work together if we were to use font-lock-face.

Multiple non-standard ones? I don't know (though there are some things you could try). But your current approach doesn't offer any solutions for that either, does it?

OTOH, the suggested approach is very compatible with whitespace-mode, for instance.

sebastiansturm · 2020-06-02T20:11:28Z

(presumably after cleaning existing font-lock-face properties within the region to be fontified?)

Yup. You can follow js2-mode's example: https://github.com/mooz/js2-mode/blob/master/js2-mode.el#L6641-L6657. Its highlighting doesn't use font-lock either.

thank you, I'll have a look at it. Will put this point (face vs font-lock-face) up for discussion in the ranged semantic tokens PR

On the other hand though, it seems to me I could then no longer suspend refontification until the new tokens have arrived, in which case fontification would be performed twice (once with regular font locking + old tokens, then with regular font locking + new tokens).

I suppose it is true. On the other hand, you only need to clear existing highlighting when you receive the response from the server. So this approach can work without blinking, or similar visual aggravations.

yes, I'm just a bit worried about the runtime cost though I haven't done any fontification benchmarks. Should do that, then we can decide whether fontifying twice incurs a relevant performance penalty or not

Also, I wonder how different nonstandard fontification mechanisms would work together if we were to use font-lock-face.

Multiple non-standard ones? I don't know (though there are some things you could try). But your current approach doesn't offer any solutions for that either, does it?

currently, I'm calling font-lock-fontify-region-function followed by lsp-provided semantic highlighting (or at least that is the plan, I haven't tested this together with a nondefault value of font-lock-fontify-region-function). So I'd expect that if there was another mechanism that defined font-lock-face properties, or if there were one or several other mechanisms that similarly patched up font-lock-fontify-region-function, face properties would still be layered correctly?

OTOH, the suggested approach is very compatible with whitespace-mode, for instance.

does the current implementation of semantic highlighting interfere with whitespace-mode in some way? If so, I didn't know; that would of course be a severe shortcoming

dgutov · 2020-06-02T20:21:19Z

worried about the runtime cost though I haven't done any fontification benchmarks

I suppose that could be some concern. Though you probably mean double redisplay. Off the top of my head, I'm not sure which option is going to be faster, but suspect that mine will.

So I'd expect that if there was another mechanism that defined font-lock-face properties, or if there were one or several other mechanisms that similarly patched up font-lock-fontify-region-function, face properties would still be layered correctly?

The former -- probably yes. The latter - I don't know. It's hard to predict the effect of such layering of fontification functions.

does the current implementation of semantic highlighting interfere with whitespace-mode in some way? If so, I didn't know; that would of course be a severe shortcoming

I don't know. whitespace-mode was just a benchmark of the sanity of my approach.

sebastiansturm · 2020-06-03T20:14:40Z

worried about the runtime cost though I haven't done any fontification benchmarks

I suppose that could be some concern. Though you probably mean double redisplay. Off the top of my head, I'm not sure which option is going to be faster, but suspect that mine will.

interesting. Not sure why it would be faster, but if it is, I'll gladly make the change. Will set up some benchmarks on the weekend

sebastiansturm added 10 commits May 24, 2020 14:22

[WIP] Add rudimentary semanticTokens support

3a2cd7d

Use buffer-local advice

ff8e42b

Don't request semantic tokens if unsupported

48f8e30

Use lsp-feature? to detect semanticTokens support

6e9ba82

Prevent out-of-bounds access

9c03646

Update semantic highlighting docstrings

1e7dc2f

[untested] Add support for token modifiers

bae93fa

clangd returns tokenModifiers = [], so I'll need another language server to test this against. There's a new defcustom to enable/disable modifier highlighting that defaults to disabled for the time being

Add some token types defined by clangd

eaf1e35

Mention semantic tokens support in readme

91b0a5e

Remove leftover `append'

c392d05

Remove unused variables

91e2cf6

yyoncho approved these changes May 24, 2020

View reviewed changes

ericdallo mentioned this pull request May 24, 2020

[Analysis Server] Support for LSP semantic tokens dart-lang/sdk#42036

Closed

Fix several issues raised by @yyoncho

01f9b5b

yyoncho merged commit 63105c7 into emacs-lsp:master May 29, 2020

nsemrau reviewed May 29, 2020

View reviewed changes

sebastiansturm mentioned this pull request Jun 2, 2020

Use ranged semanticToken requests when available #1744

Merged

matklad mentioned this pull request Apr 8, 2022

epoch_manager: refactor get_and_update_epoch_info_aggregator near/nearcore#6531

Merged

Uh oh!

Support semantic tokens protocol #1721

Support semantic tokens protocol #1721

Uh oh!

Conversation

sebastiansturm commented May 24, 2020

Uh oh!

yyoncho commented May 24, 2020

Uh oh!

yyoncho left a comment

Choose a reason for hiding this comment

Uh oh!

yyoncho May 24, 2020

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

tschuett commented May 24, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ericdallo commented May 24, 2020

Uh oh!

sebastiansturm commented May 24, 2020

Uh oh!

sebastiansturm commented May 24, 2020

Uh oh!

tschuett commented May 25, 2020

Uh oh!

yyoncho commented May 25, 2020

Uh oh!

nsemrau left a comment

Choose a reason for hiding this comment

Uh oh!

nsemrau May 29, 2020

Choose a reason for hiding this comment

Uh oh!

sebastiansturm May 29, 2020

Choose a reason for hiding this comment

Uh oh!

dgutov commented Jun 1, 2020

Uh oh!

sebastiansturm commented Jun 2, 2020

Uh oh!

dgutov commented Jun 2, 2020

Uh oh!

sebastiansturm commented Jun 2, 2020

Uh oh!

dgutov commented Jun 2, 2020

Uh oh!

sebastiansturm commented Jun 3, 2020

Uh oh!

Uh oh!

tschuett commented May 24, 2020 •

edited

Loading