Skip to content

Commit 092f059

Browse files
dannyvvarcanis
andauthored
Use RefCountedCache for ZipFs instances in Cache to avoid duplicate allocations (#6723)
## What's the problem this PR addresses? In pnpm mode we are seeing ZipFS instances created multiple times for the same package file in the cache. On large projects this easily leads to Out Of Memory Errors. This is because the 'Link' phase runs worker threads and the LazyFs instance creates ZipFS on demand. Due to multipel threads accessing the FileSystem asyncroneously in the Link phase, threading issues causing the factory inside LazyFs to be called and ZipFS instances to be created multiple times. We have a bunch of large internal packages (between 100Mb and 200Mb) that are referenced by many packages causing sometimes up to 6 instances of ZipFS for a single package. This makes the WebAssembly instance to run out of memory very fast :) #6722 Has more details on the conclusion and a discussion of approachers Resolves #6722 ... ## How did you fix it? I created a `RefCountedCache` class that has an `addOrCreate` function that takes a key. (In our use case we pass the zip's cached path as the key) and a function to create create the `ZipFS` when needed. internally that class uses reference counting for handles and will return the cached object when there is already one live. It will also return a release function that will decrease the refcount and will call the destructor passed into the constructor of the `RefCountedCache`. In this case the destructor called will be `ZipFS.discardAndClose` In the out of range cases the addOrCreate and the release call will throw an error. I provided a unittest to cover the basic RefCountedCache implementation and relied on integration tests and local testing in our repo for validation. ... ## Checklist <!--- Don't worry if you miss something, chores are automatically tested. --> <!--- This checklist exists to help you remember doing the chores when you submit a PR. --> <!--- Put an `x` in all the boxes that apply. --> - [x] I have read the [Contributing Guide](https://yarnpkg.com/advanced/contributing). <!-- See https://yarnpkg.com/advanced/contributing#preparing-your-pr-to-be-released for more details. --> <!-- Check with `yarn version check` and fix with `yarn version check -i` --> - [X] I have set the packages that need to be released for my changes to be effective. <!-- The "Testing chores" workflow validates that your PR follows our guidelines. --> <!-- If it doesn't pass, click on it to see details as to what your PR might be missing. --> - [x] I will check that all automated PR checks pass before the PR gets reviewed. --------- Co-authored-by: Maël Nison <[email protected]>
1 parent 5046750 commit 092f059

File tree

4 files changed

+271
-6
lines changed

4 files changed

+271
-6
lines changed

.yarn/versions/1137015d.yml

+34
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
releases:
2+
"@yarnpkg/cli": minor
3+
"@yarnpkg/core": minor
4+
5+
declined:
6+
- "@yarnpkg/plugin-compat"
7+
- "@yarnpkg/plugin-constraints"
8+
- "@yarnpkg/plugin-dlx"
9+
- "@yarnpkg/plugin-essentials"
10+
- "@yarnpkg/plugin-exec"
11+
- "@yarnpkg/plugin-file"
12+
- "@yarnpkg/plugin-git"
13+
- "@yarnpkg/plugin-github"
14+
- "@yarnpkg/plugin-http"
15+
- "@yarnpkg/plugin-init"
16+
- "@yarnpkg/plugin-interactive-tools"
17+
- "@yarnpkg/plugin-link"
18+
- "@yarnpkg/plugin-nm"
19+
- "@yarnpkg/plugin-npm"
20+
- "@yarnpkg/plugin-npm-cli"
21+
- "@yarnpkg/plugin-pack"
22+
- "@yarnpkg/plugin-patch"
23+
- "@yarnpkg/plugin-pnp"
24+
- "@yarnpkg/plugin-pnpm"
25+
- "@yarnpkg/plugin-stage"
26+
- "@yarnpkg/plugin-typescript"
27+
- "@yarnpkg/plugin-version"
28+
- "@yarnpkg/plugin-workspace-tools"
29+
- "@yarnpkg/builder"
30+
- "@yarnpkg/doctor"
31+
- "@yarnpkg/extensions"
32+
- "@yarnpkg/nm"
33+
- "@yarnpkg/pnpify"
34+
- "@yarnpkg/sdks"

packages/yarnpkg-core/sources/Cache.ts

+12-6
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ import fs from 'fs';
77

88
import {Configuration} from './Configuration';
99
import {MessageName} from './MessageName';
10+
import {RefCountedCache, RefCountedCacheEntry} from './RefCountedCache';
1011
import {ReportError} from './Report';
1112
import * as hashUtils from './hashUtils';
1213
import * as miscUtils from './miscUtils';
@@ -72,6 +73,8 @@ export class Cache {
7273
checksum: string | null,
7374
]>> = new Map();
7475

76+
private refCountedZipFsCache = new RefCountedCache<string, ZipFS>(zipFs => zipFs.discardAndClose);
77+
7578
/**
7679
* To ensure different instances of `Cache` doesn't end up copying to the same
7780
* temporary file this random ID is appended to the filename.
@@ -467,14 +470,17 @@ export class Cache {
467470
if (!shouldMock)
468471
this.markedFiles.add(cachePath);
469472

470-
let zipFs: ZipFS | undefined;
473+
const createRefCount = () => this.refCountedZipFsCache.addOrCreate(cachePath, () => {
474+
return shouldMock
475+
? makeMockPackage()
476+
: new ZipFS(cachePath, {baseFs, readOnly: true});
477+
});
471478

472-
const zipFsBuilder = shouldMock
473-
? () => makeMockPackage()
474-
: () => new ZipFS(cachePath, {baseFs, readOnly: true});
479+
let refCountedCacheEntry: RefCountedCacheEntry<ZipFS> | undefined;
475480

476481
const lazyFs = new LazyFS<PortablePath>(() => miscUtils.prettifySyncErrors(() => {
477-
return zipFs = zipFsBuilder();
482+
refCountedCacheEntry = createRefCount();
483+
return refCountedCacheEntry.value;
478484
}, message => {
479485
return `Failed to open the cache entry for ${structUtils.prettyLocator(this.configuration, locator)}: ${message}`;
480486
}), ppath);
@@ -484,7 +490,7 @@ export class Cache {
484490
const aliasFs = new AliasFS(cachePath, {baseFs: lazyFs, pathUtils: ppath});
485491

486492
const releaseFs = () => {
487-
zipFs?.discardAndClose();
493+
refCountedCacheEntry?.release();
488494
};
489495

490496
// We hide the checksum if the package presence is conditional, because it becomes unreliable
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
1+
export type RefCountedCacheEntry<TValue> = {
2+
value: TValue;
3+
release: () => void;
4+
};
5+
6+
/**
7+
* A cache map with reference counting. This map is designed to handle
8+
* a resource that has native/wasm handles which need to be release explicitly.
9+
* It also requires the value to have a unique map to cache instanches
10+
*/
11+
export class RefCountedCache<TKey, TValue> {
12+
private map = new Map<TKey, {value: TValue, refCount: number}>();
13+
14+
/**
15+
* Creates a new RefCountedMap.
16+
* @param releaseFunction The function to release the native resources.
17+
*/
18+
constructor(private releaseFunction: (value: TValue) => void) {
19+
}
20+
21+
/**
22+
*
23+
* @param key A unique key to indentify the instance in this Map
24+
* @param createInstance The function to create a new instance of TValue if none already esists
25+
* @returns The value form the cache (or newly created when not present) as well as the release function
26+
* to call when the object is to be released.
27+
*/
28+
addOrCreate(key: TKey, createInstance: () => TValue): RefCountedCacheEntry<TValue> {
29+
const result = this.map.get(key);
30+
31+
if (typeof result !== `undefined`) {
32+
if (result.refCount <= 0)
33+
throw new Error(`Race condition in RefCountedMap. While adding a new key the refCount is: ${result.refCount} for ${JSON.stringify(key)}`);
34+
35+
result.refCount++;
36+
37+
return {
38+
value: result.value,
39+
release: () => this.release(key),
40+
};
41+
} else {
42+
const newValue = createInstance();
43+
44+
this.map.set(key, {
45+
refCount: 1,
46+
value: newValue,
47+
});
48+
49+
return {
50+
value: newValue,
51+
release: () => this.release(key),
52+
};
53+
}
54+
}
55+
56+
/**
57+
* Releases the object by decreasing the refcount. When the last reference is released (i.e. the refcount goes to 0)
58+
* This function will call to the releaseFunction passed to the cache map to release the native resources.
59+
*/
60+
private release(key: TKey): void {
61+
const result = this.map.get(key);
62+
if (!result)
63+
throw new Error(`Unbalanced calls to release. No known instances of: ${JSON.stringify(key)}`);
64+
65+
const refCount = result.refCount;
66+
if (refCount <= 0)
67+
throw new Error(`Unbalanced calls to release. Too many release vs alloc refcount would become: ${refCount - 1} of ${JSON.stringify(key)}`);
68+
69+
if (refCount == 1) {
70+
this.map.delete(key);
71+
this.releaseFunction(result.value);
72+
} else {
73+
result.refCount--;
74+
}
75+
}
76+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,149 @@
1+
import {RefCountedCache} from '../sources/RefCountedCache';
2+
3+
describe(`RefCountedCache`, () => {
4+
it(`should create value on first create`, () => {
5+
const actions: Array<string> = [];
6+
const cache = new RefCountedCache<string, string>((id => actions.push(`release ${id}`)));
7+
8+
const result = cache.addOrCreate(`a`, () => {
9+
const result = `create a-1`; actions.push(result); return result;
10+
});
11+
12+
expect(result.value).toBe(`create a-1`);
13+
expect(actions).toStrictEqual([`create a-1`]);
14+
});
15+
16+
it(`should release single value`, () => {
17+
const actions: Array<string> = [];
18+
const cache = new RefCountedCache<string, string>((id => actions.push(`release ${id}`)));
19+
20+
const result = cache.addOrCreate(`a`, () => {
21+
const result = `create a-1`; actions.push(result); return result;
22+
});
23+
24+
result.release();
25+
expect(actions).toStrictEqual([`create a-1`, `release create a-1`]);
26+
});
27+
28+
it(`should return first created value and only release on the last value`, () => {
29+
const actions: Array<string> = [];
30+
const cache = new RefCountedCache<string, string>((id => actions.push(`release ${id}`)));
31+
32+
const result1 = cache.addOrCreate(`a`, () => {
33+
const result = `create a-1`; actions.push(result); return result;
34+
});
35+
36+
expect(result1.value).toBe(`create a-1`);
37+
expect(actions).toStrictEqual([`create a-1`]);
38+
39+
// Creating new value with same key should reuse the previous value.
40+
const result2 = cache.addOrCreate(`a`, () => {
41+
const result = `create a-2`; actions.push(result); return result;
42+
});
43+
44+
expect(result2.value).toBe(`create a-1`);
45+
expect(actions).toStrictEqual([`create a-1`]);
46+
47+
// releasing one should not call release function
48+
result1.release();
49+
expect(actions).toStrictEqual([`create a-1`]);
50+
51+
// releasing second should call release, but on the first created instance.
52+
result2.release();
53+
expect(actions).toStrictEqual([`create a-1`, `release create a-1`]);
54+
});
55+
56+
it(`should handle multiple keys single value`, () => {
57+
const actions: Array<string> = [];
58+
const cache = new RefCountedCache<string, string>((id => actions.push(`release ${id}`)));
59+
60+
const result1 = cache.addOrCreate(`a`, () => {
61+
const result = `create a-1`; actions.push(result); return result;
62+
});
63+
64+
result1.release();
65+
66+
const result2 = cache.addOrCreate(`b`, () => {
67+
const result = `create b-2`; actions.push(result); return result;
68+
});
69+
70+
result2.release();
71+
72+
const result3 = cache.addOrCreate(`c`, () => {
73+
const result = `create c-3`; actions.push(result); return result;
74+
});
75+
76+
cache.addOrCreate(`d`, () => {
77+
const result = `create d-4`; actions.push(result); return result;
78+
});
79+
80+
const result5 = cache.addOrCreate(`e`, () => {
81+
const result = `create e-5`; actions.push(result); return result;
82+
});
83+
84+
result5.release();
85+
// skipping release 4 release
86+
result3.release();
87+
88+
expect(actions).toStrictEqual([
89+
`create a-1`,
90+
`release create a-1`,
91+
`create b-2`,
92+
`release create b-2`,
93+
`create c-3`,
94+
`create d-4`,
95+
`create e-5`,
96+
`release create e-5`,
97+
`release create c-3`,
98+
]);
99+
});
100+
101+
it(`should can create new instances after removing releasing value`, () => {
102+
const actions: Array<string> = [];
103+
const cache = new RefCountedCache<string, string>((id => actions.push(`release ${id}`)));
104+
105+
const result1 = cache.addOrCreate(`a`, () => {
106+
const result = `create a-1`; actions.push(result); return result;
107+
});
108+
109+
const result2 = cache.addOrCreate(`a`, () => {
110+
const result = `create a-2`; actions.push(result); return result;
111+
});
112+
113+
result1.release();
114+
result2.release();
115+
116+
const result3 = cache.addOrCreate(`a`, () => {
117+
const result = `create a-3`; actions.push(result); return result;
118+
});
119+
120+
const result4 = cache.addOrCreate(`a`, () => {
121+
const result = `create a-4`; actions.push(result); return result;
122+
});
123+
124+
result4.release();
125+
result3.release();
126+
127+
expect(actions).toStrictEqual([
128+
`create a-1`,
129+
`release create a-1`,
130+
`create a-3`,
131+
`release create a-3`,
132+
]);
133+
});
134+
135+
it(`should throw when releasing too many times`, () => {
136+
const actions: Array<string> = [];
137+
const cache = new RefCountedCache<string, string>((id => actions.push(`release ${id}`)));
138+
139+
const result1 = cache.addOrCreate(`a`, () => {
140+
const result = `create a-1`; actions.push(result); return result;
141+
});
142+
143+
result1.release();
144+
145+
expect(() => {
146+
result1.release();
147+
}).toThrow(/No known instances of: "a"/);
148+
});
149+
});

0 commit comments

Comments
 (0)