Skip to content

Commit 50414c3

Browse files
authored
fix(BREAKING CHANGE): dereference caching to prevent infinite loops on circular schemas (#380)
We've come across an interesting case with one of our customers OpenAPI definitions where due to the amount of circular references they are using it ends up sending `$RefParser.dereference()` into an infinite loop that has unfortunately been causing us some pain. The following is a video of running the `dereference.circular=ignore` test in this PR without my eventual fix: https://github.com/user-attachments/assets/077129bd-997a-40b7-aa57-8129bd7df87f I've killed the process after 20 seconds but we've seen cases where it can run for almost an hour trying to dereference this circular API definition. In investigating this we've identified a couple issues: 1. When dereferencing this specific schema the event loop gets blocked in the process preventing the `options.timeoutMs` check and exception from ever getting hit. * We were able to resolve this by adding a supplementary timeout check when the dereference crawler processes each property within an object. 2. The dereference cache isn't being fully taken advantage of. In investigating the cache issues we noticed a couple more issues: #### Core issues with `Set` caches The `Set` objects that are used for `parents` and `processedObjects` don't appear to be fully utilized because these are often setting objects, and `Set` does not do object deuping: ```js const set = new Set(); set.add({ type: 'string' }); set.add({ type: 'string' }); console.log({ set: Array.from(set), has: set.has({ type: 'string'} )}) > {set: Array(2), has: false} ``` I'm not convinced that any of the `.has()` checks being executed on these stores are currently working and I made an attempt at pulling in [flatted](https://npm.im/flatted)[^1] to create consistent and unique keys off of these values however a couple unit tests broken in unexpected ways and ended up moving on. I would love to spend some more time investigating this because I think in extreme cases like ours we could really improve memory usage during dereferencing. #### The dereference cache is only being used for non-circular objects After crawling to dereferencing an object and either updating the original schema or resetting it to the original `$ref` if `dereference.circular=ignore` is set that resulting object is saved into the dereferenced cache `Map`. This map is referenced at the beginning of the `dereference$Ref` function however the cache is only utilized if the found object is **not** circular. I was unable to uncover a reason why this was the case but without this cache being utilized this dereferencer would continuously, even despite `dereference.circular=ignore` being configured, crawl and re-dereference objects it had already seen. Changing this logic to always return the cache if we found something brought dereferencing our use case down from ∞ to less than a second. https://github.com/user-attachments/assets/4d38a619-7b0b-4ec8-8f10-a3e728855a84 <sub>Ignore the logs in this video, it was recorded while I was still working on this fix.</sub> [^1]: Because `JSON.stringify()` cannot serialize circular objects. BREAKING CHANGE: dereference caching to prevent infinite loops on circular schemas
1 parent da0fda2 commit 50414c3

File tree

4 files changed

+2956
-11
lines changed

4 files changed

+2956
-11
lines changed

lib/dereference.ts

+32-6
Original file line numberDiff line numberDiff line change
@@ -69,11 +69,8 @@ function crawl<S extends object = JSONSchema, O extends ParserOptions<S> = Parse
6969
circular: false,
7070
};
7171

72-
if (options && options.timeoutMs) {
73-
if (Date.now() - startTime > options.timeoutMs) {
74-
throw new TimeoutError(options.timeoutMs);
75-
}
76-
}
72+
checkDereferenceTimeout<S, O>(startTime, options);
73+
7774
const derefOptions = (options.dereference || {}) as DereferenceOptions;
7875
const isExcludedPath = derefOptions.excludedPathMatcher || (() => false);
7976

@@ -98,6 +95,8 @@ function crawl<S extends object = JSONSchema, O extends ParserOptions<S> = Parse
9895
result.value = dereferenced.value;
9996
} else {
10097
for (const key of Object.keys(obj)) {
98+
checkDereferenceTimeout<S, O>(startTime, options);
99+
101100
const keyPath = Pointer.join(path, key);
102101
const keyPathFromRoot = Pointer.join(pathFromRoot, key);
103102

@@ -214,7 +213,17 @@ function dereference$Ref<S extends object = JSONSchema, O extends ParserOptions<
214213
const $refPath = url.resolve(shouldResolveOnCwd ? url.cwd() : path, $ref.$ref);
215214

216215
const cache = dereferencedCache.get($refPath);
217-
if (cache && !cache.circular) {
216+
217+
if (cache) {
218+
// If the object we found is circular we can immediately return it because it would have been
219+
// cached with everything we need already and we don't need to re-process anything inside it.
220+
//
221+
// If the cached object however is _not_ circular and there are additional keys alongside our
222+
// `$ref` pointer here we should merge them back in and return that.
223+
if (cache.circular) {
224+
return cache;
225+
}
226+
218227
const refKeys = Object.keys($ref);
219228
if (refKeys.length > 1) {
220229
const extraKeys = {};
@@ -294,6 +303,23 @@ function dereference$Ref<S extends object = JSONSchema, O extends ParserOptions<
294303
return dereferencedObject;
295304
}
296305

306+
/**
307+
* Check if we've run past our allowed timeout and throw an error if we have.
308+
*
309+
* @param startTime - The time when the dereferencing started.
310+
* @param options
311+
*/
312+
function checkDereferenceTimeout<S extends object = JSONSchema, O extends ParserOptions<S> = ParserOptions<S>>(
313+
startTime: number,
314+
options: O,
315+
): void {
316+
if (options && options.timeoutMs) {
317+
if (Date.now() - startTime > options.timeoutMs) {
318+
throw new TimeoutError(options.timeoutMs);
319+
}
320+
}
321+
}
322+
297323
/**
298324
* Called when a circular reference is found.
299325
* It sets the {@link $Refs#circular} flag, executes the options.dereference.onCircular callback,
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,107 @@
1+
import { describe, it } from "vitest";
2+
import $RefParser from "../../../lib/index.js";
3+
import helper from "../../utils/helper.js";
4+
import path from "../../utils/path.js";
5+
6+
import { expect } from "vitest";
7+
8+
describe("Schema with an extensive amount of circular $refs", () => {
9+
it("should dereference successfully", async () => {
10+
const circularRefs = new Set<string>();
11+
12+
const parser = new $RefParser<Record<string, any>>();
13+
const schema = await parser.dereference(path.rel("test/specs/circular-extensive/schema.json"), {
14+
dereference: {
15+
onCircular: (ref: string) => circularRefs.add(ref),
16+
},
17+
});
18+
19+
// Ensure that a non-circular $ref was dereferenced.
20+
expect(schema.components?.schemas?.ArrayOfMappedData).toStrictEqual({
21+
type: "array",
22+
items: {
23+
type: "object",
24+
properties: {
25+
mappingTypeName: { type: "string" },
26+
sourceSystemValue: { type: "string" },
27+
mappedValueID: { type: "string" },
28+
mappedValue: { type: "string" },
29+
},
30+
additionalProperties: false,
31+
},
32+
});
33+
34+
// Ensure that a circular $ref **was** dereferenced.
35+
expect(circularRefs).toHaveLength(23);
36+
expect(schema.components?.schemas?.Customer?.properties?.customerNode).toStrictEqual({
37+
type: "array",
38+
items: {
39+
type: "object",
40+
properties: {
41+
customerNodeGuid: expect.any(Object),
42+
customerGuid: expect.any(Object),
43+
nodeId: expect.any(Object),
44+
customerGu: expect.any(Object),
45+
},
46+
additionalProperties: false,
47+
},
48+
});
49+
});
50+
51+
it("should dereference successfully with `dereference.circular` is `ignore`", async () => {
52+
const circularRefs = new Set<string>();
53+
54+
const parser = new $RefParser<Record<string, any>>();
55+
const schema = await parser.dereference(path.rel("test/specs/circular-extensive/schema.json"), {
56+
dereference: {
57+
onCircular: (ref: string) => circularRefs.add(ref),
58+
circular: "ignore",
59+
},
60+
});
61+
62+
// Ensure that a non-circular $ref was dereferenced.
63+
expect(schema.components?.schemas?.ArrayOfMappedData).toStrictEqual({
64+
type: "array",
65+
items: {
66+
type: "object",
67+
properties: {
68+
mappingTypeName: { type: "string" },
69+
sourceSystemValue: { type: "string" },
70+
mappedValueID: { type: "string" },
71+
mappedValue: { type: "string" },
72+
},
73+
additionalProperties: false,
74+
},
75+
});
76+
77+
// Ensure that a circular $ref was **not** dereferenced.
78+
expect(circularRefs).toHaveLength(23);
79+
expect(schema.components?.schemas?.Customer?.properties?.customerNode).toStrictEqual({
80+
type: "array",
81+
items: {
82+
$ref: "#/components/schemas/CustomerNode",
83+
},
84+
});
85+
});
86+
87+
it('should throw an error if "options.dereference.circular" is false', async () => {
88+
const parser = new $RefParser();
89+
90+
try {
91+
await parser.dereference(path.rel("test/specs/circular-extensive/schema.json"), {
92+
dereference: { circular: false },
93+
});
94+
95+
helper.shouldNotGetCalled();
96+
} catch (err) {
97+
expect(err).to.be.an.instanceOf(ReferenceError);
98+
expect(err.message).to.contain("Circular $ref pointer found at ");
99+
expect(err.message).to.contain(
100+
"specs/circular-extensive/schema.json#/components/schemas/AssignmentExternalReference/properties/assignment/oneOf/0",
101+
);
102+
103+
// $Refs.circular should be true
104+
expect(parser.$refs.circular).to.equal(true);
105+
}
106+
});
107+
});

0 commit comments

Comments
 (0)