Commit b8a11eb

Don't recrawl identical URI/resource pairs.
Shaves ~10% off the suite runtime, even though we don't have more formal benchmarks in place.
1 parent cb76033 commit b8a11eb

1 file changed, +2 -1 lines changed

referencing/_core.py

Lines changed: 2 additions & 1 deletion
@@ -169,8 +169,9 @@ def with_identified_resources(
         uncrawled = self._uncrawled.evolver()
         contents = self._contents.evolver()
         for uri, resource in pairs:
+            if uri not in self._contents or self._contents[uri][0] != resource:
+                uncrawled.add(uri)
             contents[uri] = resource, m()
-            uncrawled.add(uri)
         return evolve(
             self,
             contents=contents.persistent(),
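
The effect of the change can be sketched in isolation. The following is a simplified illustration, not the library's code: `add_resources` is a hypothetical helper, plain `dict` and `set` stand in for the persistent structures behind `evolver()`, and the second tuple element (`m()`) stored alongside each resource is left out.

```python
from typing import Any, Dict, Iterable, Set, Tuple


def add_resources(
    contents: Dict[str, Any],
    uncrawled: Set[str],
    pairs: Iterable[Tuple[str, Any]],
) -> Tuple[Dict[str, Any], Set[str]]:
    """Hypothetical stand-in: return updated copies of both collections."""
    new_contents = dict(contents)
    new_uncrawled = set(uncrawled)
    for uri, resource in pairs:
        # Queue the URI for crawling only if it is new or its resource differs
        # from what was stored before this call.
        if uri not in contents or contents[uri] != resource:
            new_uncrawled.add(uri)
        new_contents[uri] = resource
    return new_contents, new_uncrawled


schema = {"type": "object"}

# First registration: the URI is unknown, so it is queued.
contents, first = add_resources({}, set(), [("urn:a", schema)])

# Same URI, identical resource: nothing is queued again.
contents, repeat = add_resources(contents, set(), [("urn:a", schema)])

# Same URI, different resource: it is queued once more.
contents, changed = add_resources(contents, set(), [("urn:a", {"type": "string"})])
```

Before the change, the equivalent of the `if` guard was absent, so every pair was queued regardless of whether it had already been seen.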
