Commit aa53fd0

Slightly speed up the contains keyword.
Saves some unnecessary repeated validator re-creation while validating arrays.

In a quick benchmark (added here) on my local machine (an M2 Mini), this goes from:

```
 baseline: Mean +- std dev: 3.55 us +- 0.04 us
beginning: Mean +- std dev: 3.37 ms +- 0.02 ms
   middle: Mean +- std dev: 3.37 ms +- 0.03 ms
      end: Mean +- std dev: 3.36 ms +- 0.02 ms
  invalid: Mean +- std dev: 3.40 ms +- 0.02 ms
```

to:

```
 baseline: Mean +- std dev: 4.27 us +- 0.05 us
beginning: Mean +- std dev: 2.65 ms +- 0.01 ms
   middle: Mean +- std dev: 2.66 ms +- 0.02 ms
      end: Mean +- std dev: 2.67 ms +- 0.02 ms
  invalid: Mean +- std dev: 2.70 ms +- 0.02 ms
```

on the included example (synthetic of course, but not ridiculously so).

(The lack of difference in timing for how far into the array we get before finding a match seems interesting, but probably requires a benchmark with a more interesting subschema we're matching on.)
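To put the reported means in relative terms, the short sketch below recomputes the change per benchmark case. The numbers are copied from the commit message; the helper name `relative_change` is only illustrative. Note that `baseline` (an empty array) gets roughly 0.7 us slower; one plausible explanation, not stated in the commit, is that the sub-validator is now created once even when there is nothing to iterate over.

```python
# Mean timings copied from the commit message, converted to seconds.
before = {
    "baseline": 3.55e-6,
    "beginning": 3.37e-3,
    "middle": 3.37e-3,
    "end": 3.36e-3,
    "invalid": 3.40e-3,
}
after = {
    "baseline": 4.27e-6,
    "beginning": 2.65e-3,
    "middle": 2.66e-3,
    "end": 2.67e-3,
    "invalid": 2.70e-3,
}


def relative_change(old, new):
    """Fractional change in runtime; negative means faster."""
    return (new - old) / old


changes = {name: relative_change(before[name], after[name]) for name in before}

for name, change in changes.items():
    print(f"{name:>9}: {change:+.1%}")
```

On these figures the four 1000-element cases come out a little over 20% faster, while the empty-array case is about 20% slower in relative terms but under a microsecond in absolute terms.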
1 parent c9e2029 commit aa53fd0

3 files changed, +36 -1 lines changed

Diff for: CHANGELOG.rst (+5)

@@ -1,3 +1,8 @@
+v4.21.1
+=======
+
+* Slightly speed up the ``contains`` keyword by removing some unnecessary validator (re-)creation.
+
 v4.21.0
 =======
 

Diff for: jsonschema/_keywords.py (+3 -1)

@@ -95,8 +95,10 @@ def contains(validator, contains, instance, schema):
     min_contains = schema.get("minContains", 1)
     max_contains = schema.get("maxContains", len(instance))
 
+    contains_validator = validator.evolve(schema=contains)
+
     for each in instance:
-        if validator.evolve(schema=contains).is_valid(each):
+        if contains_validator.is_valid(each):
             matches += 1
             if matches > max_contains:
                 yield ValidationError(
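The change above is a plain loop-invariant hoist: the sub-validator depends only on the `contains` subschema, not on the element being checked, so it can be built once. The sketch below illustrates the effect with a hypothetical `ToyValidator` that merely counts how many instances get constructed. It is not the library's validator class, and its `evolve` only mimics the general idea of "return a new validator for a different schema".

```python
class ToyValidator:
    """Hypothetical stand-in for a validator; counts constructions."""

    created = 0

    def __init__(self, schema):
        type(self).created += 1
        self.schema = schema

    def evolve(self, schema):
        # Returns a brand-new validator for the given schema.
        return ToyValidator(schema)

    def is_valid(self, instance):
        # Only understands {"const": value}, which is all this sketch needs.
        return instance == self.schema["const"]


def count_matches_per_item(validator, contains, instance):
    """Old shape: a fresh sub-validator for every element."""
    return sum(
        validator.evolve(schema=contains).is_valid(each) for each in instance
    )


def count_matches_hoisted(validator, contains, instance):
    """New shape: one sub-validator, reused for every element."""
    contains_validator = validator.evolve(schema=contains)
    return sum(contains_validator.is_valid(each) for each in instance)


def constructions(func, instance):
    """Return (matches, number of sub-validators built) for one call."""
    root = ToyValidator({"contains": {"const": 37}})
    ToyValidator.created = 0  # ignore the root validator itself
    matches = func(root, {"const": 37}, instance)
    return matches, ToyValidator.created


if __name__ == "__main__":
    data = [0] * 999 + [37]
    print(constructions(count_matches_per_item, data))
    print(constructions(count_matches_hoisted, data))
    print(constructions(count_matches_hoisted, []))
```

For a 1000-element array the per-item version builds 1000 sub-validators and the hoisted version builds one. For an empty array the hoisted version still builds one, where the per-item version builds none.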

Diff for: jsonschema/benchmarks/contains.py (+28)

@@ -0,0 +1,28 @@
+"""
+A benchmark for validation of the `contains` keyword.
+"""
+
+from pyperf import Runner
+
+from jsonschema import Draft202012Validator
+
+schema = {
+    "type": "array",
+    "contains": {"const": 37},
+}
+validator = Draft202012Validator(schema)
+
+size = 1000
+beginning = [37] + [0] * (size - 1)
+middle = [0] * (size // 2) + [37] + [0] * (size // 2)
+end = [0] * (size - 1) + [37]
+invalid = [0] * size
+
+
+if __name__ == "__main__":
+    runner = Runner()
+    runner.bench_func("baseline", lambda: validator.is_valid([]))
+    runner.bench_func("beginning", lambda: validator.is_valid(beginning))
+    runner.bench_func("middle", lambda: validator.is_valid(middle))
+    runner.bench_func("end", lambda: validator.is_valid(end))
+    runner.bench_func("invalid", lambda: validator.is_valid(invalid))
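The commit message notes that the timings barely depend on where the match sits in the array. One reading consistent with the visible hunk of `contains` is that the loop only stops early when `matches` exceeds `max_contains`, which defaults to `len(instance)`, so with this schema every element is visited regardless. The sketch below is a simplified model of that loop, not the library code: `contains_visits` is a hypothetical helper, the subschema is reduced to an equality check, and the final `matches >= min_contains` check is an assumption, since the tail of the function is not shown in the diff.

```python
def contains_visits(instance, target=37, min_contains=1, max_contains=None):
    """Simplified model of the loop: return (is_valid, items_visited)."""
    if max_contains is None:
        max_contains = len(instance)

    matches = 0
    visited = 0
    for each in instance:
        visited += 1
        if each == target:  # stands in for contains_validator.is_valid(each)
            matches += 1
            if matches > max_contains:
                # The only early exit visible in the diff.
                return False, visited

    # Assumed final check; the real function's tail is not in the hunk.
    return matches >= min_contains, visited


# Same inputs as the benchmark file.
size = 1000
cases = {
    "beginning": [37] + [0] * (size - 1),
    "middle": [0] * (size // 2) + [37] + [0] * (size // 2),
    "end": [0] * (size - 1) + [37],
    "invalid": [0] * size,
}

results = {name: contains_visits(data) for name, data in cases.items()}
for name, (ok, visited) in results.items():
    print(f"{name:>9}: valid={ok} visited={visited}")
```

In this model all four cases walk the full array (`middle` has one extra element because of how it is built), which would explain the near-identical timings if the real loop behaves the same way.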
