Skip to content

fix(host): invalidate cache less to speed up compilation for large projects #476

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

std4453
Copy link

@std4453 std4453 commented May 7, 2025

Summary

Fixes a few problems related to host.ts and speed up compilation for large projects.

Details

The current version of rollup-plugin-typescript2 causes the TypeScript cache to get constantly invalidated, for each entry file during a single rollup build, causing build time to increase non-linearly. For for fairly complicated projects, this easily bloat into minutes (about 400s for our project with ~50 files and ~15000 locs).

I've identified and fixed these problems that caused this:

  • In setSnapshot, file version is increased unnecessarily when source hasn't changed.
  • Files in node_modules are added incorrectly into LanguageServiceHost.fileNames by setSnapshot, since the function is called indirectly for all dependency files discovered while type checking. fileNames should be the list of entry files for this build, there's no point updating it during the build process.
  • In getScriptSnapshot, also create snapshot when file is empty, otherwise it would be marked as 'missing' and invalidate the cache.

I've verified that the updated code can compile itself and our company project, making absolutely no changes to the output. The new compile time is 14s compared to 400s.

You can override rollup-plugin-typescript2 with @std4453/rollup-plugin-typescript2 to try it out, which is forked from version 0.36.1.

@agilgur5 agilgur5 added topic: TS Compiler API Docs Related to the severely lacking TS Compiler API Docs kind: optimization Performance, space, size, etc improvement labels May 7, 2025
@agilgur5 agilgur5 changed the title Speed up compilation for large projects fix(host): invalidate cache less to speed up compilation for large projects May 7, 2025
@std4453
Copy link
Author

std4453 commented May 8, 2025

@agilgur5 I think that CI is failing because the latest version of actions/setup-node is now v4.

@ezolenko
Copy link
Owner

ezolenko commented May 8, 2025

@std4453 yeah, I fixed it in master yesterday, if you merge CI should work

@agilgur5
Copy link
Collaborator

agilgur5 commented May 8, 2025

Yes you should be able to rebase on top of #477 to get CI to run properly.

I haven't fully reviewed this, but it might be good to add regression tests for these changes. Also some tests may need updating to account for these changes

Copy link
Collaborator

@agilgur5 agilgur5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My memory is pretty foggy with this codebase right now, especially with regard to the sparsely documented TS Compiler internals, but I left a few comments in-line

src/host.ts Outdated
Comment on lines 34 to 38
const originalSnapshot = this.snapshots[fileName];

if (originalSnapshot && originalSnapshot.getText(0, originalSnapshot.getLength()) === source) {
return originalSnapshot;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In setSnapshot, file version is increased unnecessarily when source hasn't changed.

This more or less makes sense to me. This does assume that this check isn't already done elsewhere though (e.g. the Language Service itself or Rollup, etc). Have you checked if this specific change in isolation makes a performance reduction? If the check is duplicative, then this actually would decrease performance as a redundant check

The only reference to version I can find right now is in this section of the Compiler API docs. I feel like there were more examples I referenced before but I can't remember right now

Also some stylistic changes would be helpful here for consistency:

Suggested change
const originalSnapshot = this.snapshots[fileName];
if (originalSnapshot && originalSnapshot.getText(0, originalSnapshot.getLength()) === source) {
return originalSnapshot;
}
// don't update the snapshot if there are no changes
const prevSnapshot = this.snapshots[fileName];
if (prevSnapshot?.getText(0, prevSnapshot.getLength()) === source)
return prevSnapshot;
  • add a comment to explain this
  • "previous" is clearer than "original"; to disambiguate between "previous" and "next" basically
  • consistent formatting to the rest of the codebase with the if statement
  • can use optional chaining (?.) to simplify the condition

Copy link
Author

@std4453 std4453 May 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What TypeScript does is that it first calls getScriptVersion to check if anything has changed, then getScriptSnapshot to get the new snapshot. In a typical implementation, change of version should happen before building new snapshot. It should actually happen before calling getEmitOutput, which exactly what the official example does.

Here I honestly don't understand why version is updated inside getSnapshot, however in order to not break anything, I compared the current snapshot to the requested source. This should be okay because ScriptSnapshot.fromString is simply creating a StringScriptSnapshot object, which holds no state other than source, and never writes to it, so it should behave identically in every way to the previous snapshot.

In a typical implementation where hosts listen to FS events for file updates, there's usually only one changed file, however that information isn't available to TypeScript, so it needs to check every imported file for changes. Reading files from slow filesystems might take hundreds of milliseconds, so TypeScript opens up an optimizing opportunity to the host by separating change check into two phases, first getScriptVersion, then getScriptSnapshot.

Originally, the TypeScript cache gets constantly invalidated, causing build time to increase non-linearly, bloating into minutes for fairly complicated projects.

This commit identified and fixed these problems that caused this problem:

- File version is increased unnecessarily when source hasn't changed.
- Files in node_modules are added incorrectly into LanguageServiceHost.fileNames, since `setSnapshot` is called indirectly for all dependency files discovered when type checking. `fileNames` represents the entry files for this build, so it shouldn't be updating during the process.
- Also creating snapshot when file read in `getScriptSnapshot` is empty (which is still a valid file whatsoever), otherwise it would be marked as 'missing' and invalidate the cache.
@std4453
Copy link
Author

std4453 commented May 9, 2025

@ezolenko I've rebased my branch to include the CI fix, look like it needs approval from maintainers to run.

@agilgur5
Copy link
Collaborator

agilgur5 commented May 9, 2025

I approved it before (when it failed due to old actions/setup-node@v2), but apparently we need to do so again after the rebase.
It has ran now and errors out as expected on the host tests, which would need adjusting.

@std4453
Copy link
Author

std4453 commented May 12, 2025

OK I think I've resolved all the code comments, however I couldn't get the watch CI spec working, even locally. Weirdly though, even if I revert the code I've edited, the test would still fail. Any idea on why that might happen? @agilgur5

@std4453
Copy link
Author

std4453 commented May 14, 2025

I've done more digging and here's what I found:

The remaining test doesn't work because TS does not view the file as changed, therefore it won't recompile the file. This is because servicesHost is not preserved across builds under watch mode, so versions always starts with undefined.

Under watch mode when a file was updated, rollup first calls buildStart and then transform. In buildStart, rpt2 calls service.getCompilerOptionsDiagnostics, which would tell TS to sync host versions, and it invokes getScriptSnapshot for all related files, effectively setting all their versions to 1. Then, in transform, rpt2 calls setSnapshot, which originally bumps version of the updated file to 2.

However, since we now check whether source content has changed in setSnapshot, the invocation in transform simply does nothing, leaving the version of the file with 1. This means that in every build under watch mode, the version of every file stays at 1, so TypeScript never sees the need to fetch new file snapshots from host, and recompiles no files at all.

This still wouldn't be a problem if TypeScript always create its initial snapshots with latest sources, however it actually read from documentRegistry if file version is not changed, and #388 makes documentRegistry get preserved across builds, and it never get updated if getScriptVersion stays at 1.

(p.s., if the updated file get updated again, it looks like the script versions would still be 1 for all other files and 2 for the updated file. This actually isn't a problem because service.getCompilerOptionsDiagnostics was called before setSnapshot, so the updated file would still have version 1, and TypeScript think that its version has changed and reloads the file. Eventually it gets reloaded again in getEmitOutput, which can be optimized but maybe next time.)

To avoid further tampering with existent logic and breaking things unexpected, I added a third argument forceUpdate to setSnapshot, which forces the bumping of version even when source is not changed. It is only passed true in transform under watch mode, so that wouldn't affect any other branch.

Under watch mode, this effectively restores the original behavior of always incrementing version in setSnapshot, which can be verified since there's only two places calling setSnapshot, and getScriptSnapshot wouldn't even call setSnapshot if there already is a snapshot for the file requested. Under build mode, source file would not change during the build process, so there's no need to update version whatsoever.

After this, all tests should be passing now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: optimization Performance, space, size, etc improvement topic: TS Compiler API Docs Related to the severely lacking TS Compiler API Docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants