-
Notifications
You must be signed in to change notification settings - Fork 77
Remove repr(C) from Mutator #1318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
If struct ImmixMutator {
#[allocator]
immix_allocator: ImmixAllocator,
#[parent]
common: CommonMutator,
}
impl MutatorContext for ImmixMutator {
fn alloc(...) {
match semantics {
AllocatorSemantics::Default => immix_allocator.alloc(...),
AllocatorSemantics::Los => common.los_allocator.alloc(...),
...
}
}
fn prepare(...) {...}
fn release(...) {...}
fn on_destroy(...) {...}
} The concept of |
It was intended not to use the old approach. One of the designs was that we want the mutator struct to have a fixed size for all the plans. This is essential to allow dynamic plan selection. It also makes things easier for the bindings, as the fixed size won't change often, and the bindings that embed the struct don't need to worry about the size in most of the times. |
The mutator struct is designed to be accessible by the native code in the bindings. A binding may embed the struct, use a pointer to the struct, or only store the fastpath data structures. As long as they need to access I don't think we can remove |
I don't see how a fixed-size But I remember that we did think so when we deviated from the JikesRVM approach. Maybe after years of engineering, we know that some of our assumptions were not true after all.
This is based on the premise that the VM binding needs to embed the
This is my exact concern. I don't think the VM binding can use any part of
The
And the So the status quo is that only |
My assumption is that the above options are all valid uses of MMTk. We don't force users to use Based on this, the mutator size needs to be constant, and the |
OK. I see where we disagree. I'd like to explicitly outlaw the practice of embedding the Mutator because there is no meaningful way to access most of its fields in C/C++, and it brings more troubles than benefits. Even if the user worry about one level of indirection, we may allow them to embed, but we should still make And I think maybe it is an orthogonal issue to turn |
We discussed this today in our meeting. We should first refactoring the OpenJDK binding, eliminating the practice of replicating and embedding the entire |
If we want to disallow embedding mutator struct in the binding side, we need to see at least OpenJDK working with the new model proposed (using fastpath data structures). OpenJDK implements all the plans in MMTk, we need to make sure the idea works out well when all the plans are supported. |
Theoratically, if we replace the current mutator (with allocators) with fastpath data structures, we would end up with a new mutator struct: struct MutatorFastpath {
allocators: AllocatorsFastpath,
...
}
struct AllocatorsFastpath {
pub bump_pointer: [MaybeUninit<BumpPointer>; MAX_BUMP_ALLOCATORS],
pub free_list: [MaybeUninit<FreeList<VM>>; MAX_FREE_LIST_ALLOCATORS],
...
} It is just smaller than the old
|
The reason why the
Mutator
struct was made#[repr(C)]
was to make it possible for bindings to embedMutator
into their own C/C++ structs in thread-local storage. This is supposed to improve performance by allowing inlined fast-path code to access important fields (such asBumpPointer
) quickly from the TLS. However, embedding the entireMutaotr
is neither practical nor necessary.It is impractical because
Mutator
struct is so big, it is beyond the addressable range of immediate addressing in some architectures, such as RISC-V. There is already a proposal to makeMutator
smaller.Mutator
struct. This is tedious and error-prone, but both the OpenJDK and the Julia binding have done that. And the presence of Rust-specific data types, such as&dyn T
, makes it hard to get the object layout right.repr(C)
, theAllocators
sub-struct has to rely on fixed-size arrays[T; N]
. By doing this, the number of allocators of each kind are limited and unknown at run time.Mutator
, resulting in a series of cascading changes in VM bindings that depend on the precise layout, such as this PR and its associated changes in the OpenJDK and Julia bindings.Allocators
struct has to useMaybeUninit
(like[MaybeUninit<BumpAllocator<VM>>; MAX_BUMP_ALLOCATORS]
) so that array elements can remain uninitialized if the concrete plan doesn't need that many allocators of that kind. This is also the reason why theAllocators::get_allocator
andAllocators::get_allocator_mut
have to be madeunsafe
.And it is not necessary to embed the entier
Mutator
into the TLS. Actually, the allocation fast path only needs theBumpPointer
struct. The VM binding only needs to embed theBumpPointer
struct. This practice is already documented in the Porting Guide, and there is an API functionAllocatorInfo::new(selector)
for getting the offset of theBumpPointer
struct. The Ruby binding didn't embed theBumpPointer
, but just lets the TLS keep a pointer to theBumpPointer
inside theMutator
without depending on the concrete structure ofMutator
, and the ART binding manually synchronizes theBumpPointer
in theMutator
with the thread-local storage.The proposal: Remove
repr(C)
fromMutator
We remove the
#[repr(C)]
annotation onMutator
. The C/C++ part of the VM binding can no longer assume the layout ofMutator
. But we keep a bottom line that:BumpPointer
), we provide an API function for returning the pointer to (or offset of) that structure so that the VM binding can get the pointer without knowing the layout of Mutator. Currently, this can be achieved with eithermutator_address + AllocatorInfo::new(selector).bump_pointer_offset
or&mutator.allocator_impl<ConcreteAllocator>(selector).bump_pointer
.BumpPointer
) to be taken out of theMutator
and put back later. This will allow the VM binding to keep the performance-sensitive parts in the TLS as long as possible because we assume they will be used by the inlined fast paths most of the time.The
Allocators
structThe
struct Allocators
no longer needs to berepr(C)
, either. It wasrepr(C)
because it was part ofMutator
.Because it doesn't need to be
repr(C)
, its layout no longer needs to be exposed to the VM binding. Now we can useVec<T>
instead of[T; N]
, for example,The constants
MAX_BUMP_ALLOCATORS
,MAX_IMMIX_ALLOCATORS
, etc. will be removed. VM bindings no longer needs to ensure it matches the definition in mmtk-core.When creating a
Mutator
, we still need to populate theAllocators
struct according to the plan-specificSpaceMapping
. But this time it can be easier because we can just useVec::push
instead ofMaybeUninit::write
.The method
Allocator::get_allocator
andAllocator::get_allocator_mut
can be made safe. It is currently unsafe because they useMaybeUninit::assume_init_ref()
andMaybeUninit::assume_init_mut()
, and neither of them have the ability to check whether a givenMaybeUninit<T>
is actually initialized or not. With the list of allocators becomingVec
, we can simply use theVec::get()
andVec::get_mut()
methods which returnOption<&T>
andOption<&mut T>
, respectively. This will reduce up to 15 call sites in allocators and mutator prepare/release functions that involve theunsafe
keyword. We'll discuss this in a separate issue.Other fields
Other fields need no changes. But we no longer need to label them as
#[repr(C)]
. Concretely,The
Mutator::barrier
field is the only one that needs some discussion. It is currently aBox<dyn Barrier>
, and it can remain this way. Because the concrete barrier to use depends on the Plan which is selected at start-up time, some form of dynamic dispatch is always needed.We previously discussed optimizing for write-barrier slow paths by embedding some data structures in TLS. Currently we only have an object-remembering barrier (and a field-remembering barrier in the
lxr
branch). To implement those kinds of barriers, we only need a bump-pointer, too. In this case, it is acursor: *mut ObjectReference
and alimit: *const ObjectReference
, which is basically a bump-pointer into the ModBuf. Because this is strictly an optimization over the status quo, we may consider it as a separate issue.Performance
The only thing that may affect performance is the
Vec<BumpAllocator<VM>>
and other vector of allocators in theAllocators
struct. We consider the overhead ofVec
reasonable because we will only access the complete allocators in slow paths. But we need to measure it anyway.API changes
Because allocators are now held in
Vec
, the "offset from the start ofMutator
to the concreteImmixAllocator
" and the "offset from the start ofMutator
to aBumpPointer
" no longer make sense. However, given a validAllocatorSelector
, we can still uniquely identify the address of the allocator. TheAllocator::get_allocator
method will still work, butAllocatorInfo
won't work. We may refactorAllocatorInfo
so that it returns references to the underlyingBumpPointer
instances rather than the offsets.It is debatable whether we still allow a mutator thread's TLS to keep a pointer to the
BumpPointer
. Currently we give the VM binding full control over theMutator
as if it were a C struct. After this refactoring, theBox<Mutator>
returned frommemory_manager::bind_mutator()
will be the only owning reference of theMutator
, and Rust doesn't like having another mutable reference into it (which violates the "unique reference" semantics ofBox<T>
and&mut T
). Currently this only affects the Ruby binding. The ART binding is unaffected because it is moving theBumpPointer
out of the allocator for fast-path allocation, and put it back when falling back to slow path.Documentation
We should update our porting guide and remove the section about embedding the entire
Mutator
struct. The VM bindings should either not embed anything or embed only theBumpPointer
struct.The text was updated successfully, but these errors were encountered: