-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Add "localized world age" fast-path to staticdata.jl
#58040
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
619ca3d
to
2018496
Compare
Not much special support is required here, except that "extending external Methods" will need to track their updates to external MethodTable's `local_age` regardless of whether the extending Methods actually survive pre-compilation. That work is left TODO for now, but it is a JuliaLang#265 bug waiting to happen.
As a caveat, this change may significantly expand our edge vectors, since they are mostly dominated by `1-edge` Code / MethodInstance entries. These have now unfortunately doubled in size to a `UInt` (local age), followed by the CI / MI edge.
If the MethodTable has seen no updates / registrations other than those that were present when we queried against it originally, we can fast-track call verification.
2018496
to
0517cae
Compare
Copilot suggests the following change:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really interesting idea. I've been trying to figure out a way that I can confuse the local_age
by using sneaky tricks of loading packages in different orders, but so far I think you're OK: the combination of the fact that age always increases, that precompilation always occurs in a minimal world, and the fact that you don't actually need a specific local age for each method seems (as far as I can tell) to make this safe.
Divorcing local_age
from the session's world age will also make this more broadly applicable. So this seems sound to me. Really nice contribution!
for i in 1:length(edges) | ||
if edges[i] === nmatches && edges[i+1] == info.atype | ||
for i in 1:(length(edges)-2) | ||
if edges[i] === (nmatches::Int) && edges[i+2] == info.atype |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to hoist this typeassert out of the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a necessary type-assert, but it's to make it explicit that we're testing both the type and the value of the entry in the edge list (a UInt
is a local age, but an Int
is the number of MethodInstances in an edge bundle)
This version is probably clearer:
if edges[i] === (nmatches::Int) && edges[i+2] == info.atype | |
if edges[i] isa Int && edges[i] == nmatches && edges[i+2] == info.atype |
edgeᵢ isa CodeInstance && (edgeᵢ = get_ci_mi(edgeᵢ)) | ||
edgeᵢ isa MethodInstance || (i += 1; continue) | ||
if edgeᵢ === edge && !(i > 1 && edges[i-1] isa Type) | ||
return # found existing covered edge | ||
end | ||
i += 1 | ||
end | ||
mt_age !== nothing && push!(edges, mt_age) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know this code well, but I'm confused about why the condition doesn't cause edges
to go out of register.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by out of register?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm, I misread. I thought it was always skip-3, but it's not.
|
||
## Localized World Ages | ||
|
||
A "localized world age" is similar, but it only tracks changes to specific portions of the method table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think of "the method table" as meaning the table for a specific function, and thus I initially interpreted this as something specific to a subset of the signatures in a given MethodTable
. Maybe you mean "specific portions of the global method cache"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
"cache" is not quite what I mean either, since we are not allowed to drop these entries, but I think "portions of the global dispatch table" is probably closer to what I mean
Nice! A while back Jameson and I were bouncing an idea back and forth of "opaque" world ages and defining them in terms of "union of visible things", we struggled with how to write efficient queries, but IIUC "local world age" is a step in this direction. Essentially, a world-age is a (still linearized) union of all "local world" ages. |
This PR introduces the notion of "localized world age" (see the added devdocs) and adds a fast path to staticdata.jl
On my machine, this saves us ~420k of the ~720k method look-ups that we perform (~58%) for loading "using CairoMakie".
The load-time improvements are more modest (about ~1-1.5 seconds) since these were already very fast (~3 μs on average) and we still spend a lot of time walking to these edges that we don't need to verify.
I suspect there are more gains to be had here. The fast-path is moderately warm, but most of the non-fast-pathed calls are targeting "dense" MethodTables, like
(::Type)(...)
or highly ad-hoc-polymorphic tables likehaskey(...)
which will be implemented over many user types.It should be possible to generalize this to have sub-MethodTable granularity like this (numbers made up):
which would make many of those eligible for the fast path at the cost of an extra lookup or two.
Nonetheless, I wanted to open this first to get input (esp. @vtjnash's and @timholy's) and see if this approach is sound (and to work out how to best do the edge encoding etc. - it's not very efficient right now)