Skip to content

Investigate lifting 16-bit class-id limit in Dart VM #42533

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mkustermann opened this issue Jun 30, 2020 · 9 comments
Closed

Investigate lifting 16-bit class-id limit in Dart VM #42533

mkustermann opened this issue Jun 30, 2020 · 9 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.

Comments

@mkustermann
Copy link
Member

The first customers have started to hit 16-bit class id limit in JIT mode. See b/160229360

We should investigate lifting our 16-bit class-id restriction and see what the code size and heap usage impact is.

We could possibly also investigate lifting the two word heap alignment requirement, thereby freeing up space in heap objects (on average). Possibly also shrinking identity hash code (or handling it differently).

@mkustermann mkustermann added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label Jun 30, 2020
@rmacnak-google
Copy link
Contributor

Many years ago, the class id field was 16-bits on both 32- and 64-bit systems, and the extra 32-bits in the 64-bit header was wasted. Then I increased the class id and size fields on 64-bit systems (and added classid_t). Then Erik reduced the class id field back to 16-bits and used the extra 32-bits to store the identity hash.

On 64-bit systems, we could swap the sizes of the class id and identity hash fields if a 16-bit hash is acceptable.

@mraleph
Copy link
Member

mraleph commented Jun 30, 2020

Why do we assign class ids eagerly when we read classes? We should be able to assign them lazily when class is finalised for instantiation.

This does introduce certain amount of non-determinism and makes JITed code slower (potentially) due to more random range assignments, but it should probably considerably cut the size of the class table.

@mkustermann
Copy link
Member Author

Why do we assign class ids eagerly when we read classes?

IIRC One of the reasons is that isolate communication via snapshots embeds class ids in the messages. So if classes are assigned class ids lazily, two isolates can have different numbering and bad things will happen ...

Once we are done with isolate groups, we could make separate isolate groups only communicate via json-like types then we could remove this.

In general I think we just have to face reality that with the increasing success of flutter and dart, we'll have to run bigger and bigger apps and our 16-bit limitation might need to be lifted, sooner or later.

@a-siva
Copy link
Contributor

a-siva commented Jul 6, 2020

Many years ago, the class id field was 16-bits on both 32- and 64-bit systems, and the extra 32-bits in the 64-bit header was wasted. Then I increased the class id and size fields on 64-bit systems (and added classid_t). Then Erik reduced the class id field back to 16-bits and used the extra 32-bits to store the identity hash.

On 64-bit systems, we could swap the sizes of the class id and identity hash fields if a 16-bit hash is acceptable.

How do we determine if a 16-bit hash is acceptable?

@mkustermann
Copy link
Member Author

How do we determine if a 16-bit hash is acceptable?

My impression is that it might not be acceptable - in particular for 64-bit systems (which is what this discussion is about), since we often have big heaps in 64-bit systems. Some of our own tools make heavy use of identity hash codes, where 16 bits might be a huge issue.

I was assigned on b/160229360 and started making some prototypes.

@mkustermann mkustermann self-assigned this Jul 6, 2020
@mkustermann
Copy link
Member Author

I've looked into not assigning any ids to the toplevel classes: Right now it seems that toplevel class ids are both used in the service protocol as well as in types (in connection with mirrors).

To unblock the customer, I think it's best to land cl/153607 first. I'll still spend a little more time investigating whether the issues mentioned above could be fixed. If so, we can revert cl/153607 and land an alternative solution.

dart-bot pushed a commit that referenced this issue Jul 9, 2020
Right now we assign class ids to top-level classes, abstract classes as
well as concrete classes. All of them have allocated from a 16-bit pool
of ids. The VM FATAL()s once it hits that limit.

Customers who run very large programs (significant amount of generated
code) on the Dart VM have started to hit this 16-bit class limit.

Concrete classes can have instances in the heap. Our current heap layout
only allows 16-bit class ids to be encoded in the header word. To avoid
increasing the size of heap objects or shrinking the size of the identity
hash code to 16-bit we keep class ids in object headers to be 16-bit.

Abstract classes cannot have instances in the heap. Though their class
ids are encoded in type objects. Furthermore we sort classes in
AOT/AppJIT mode to perform fast class-id range checks. To avoid impacting
this optimization we treat abstract classes the same way as concrete
classes.

Top-level classes cannot have instances in the heap. Their class ids are
only used in the runtime code, for example for hot-reload as well as
part of the service protocol.

=> We can allocate class ids outside the 16-bit range for top-level
classes, thereby freeing a significant amount of space in the 16-bit
range.

This CL does exactly that: We change classid_t to be int32_t. The
ClassLayout::id_ can now be assigned ids outside 16-bit range for
top-level classes. To do this we keep dart classes and top level classes
as separate arrays in the ClassTable.

Issue #42533

See also b/160229360

Change-Id: I6710a644e7b0ab2d4f4c792bef8e1f91cb117421
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/153607
Commit-Queue: Martin Kustermann <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
@mkustermann
Copy link
Member Author

I'll still spend a little more time investigating whether the issues mentioned above could be fixed.

Made cl/156003 which shows the changes we would need to make for not assigning class ids to top-level classes (Ryan's idea).

Though the changes in that CL are not sufficient yet, there are still remaining places where we use the top-level class ids in type objects and expose those type object to Dart code, e.g. in NoSuchMethod handling, see e.g. object.cc:

    if (throw_nsm_if_absent) {
      return ThrowNoSuchMethod(
          AbstractType::Handle(Class::Handle(toplevel_class()).RareType()),
          getter_name, Object::null_array(), Object::null_array(),
          InvocationMirror::kTopLevel, InvocationMirror::kGetter);
    }

which calls NoSuchMethodError._throwNew() with the type object of a top-level class as receiver, see error_patch.dart:

@patch
class NoSuchMethodError {
  final Object? _receiver;
  final Invocation _invocation;

  @pragma("vm:entry-point", "call")
  static void _throwNew(Object receiver /* <= Type { type_class_id: top-level-cid } */, ...) {
    throw new NoSuchMethodError._withType(receiver, memberName, invocationType,
        typeArgumentsLength, typeArguments, arguments, argumentNames);
  }

As cl/156003 shows we would need to special-case top-level classes in various places. Not convinced that it's better than the current solution of assigning class ids to top-level classes but outside the 16-bit range. Will therefore stop the investigation of this approach here.

@mkustermann mkustermann removed their assignment Jul 27, 2020
dart-bot pushed a commit that referenced this issue Jul 27, 2020
Instead of using class ids when building field/function service ids, we use names instead
(as we do in some other places already).

Issue #42533

Change-Id: I55530161af26ee9514aa29a048cf140094cabd95
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/156004
Commit-Queue: Martin Kustermann <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
Reviewed-by: Ben Konyi <[email protected]>
@kf6gpe
Copy link

kf6gpe commented Aug 11, 2020

cc @kf6gpe

tekknolagi pushed a commit to tekknolagi/dart-assembler that referenced this issue Nov 3, 2020
Right now we assign class ids to top-level classes, abstract classes as
well as concrete classes. All of them have allocated from a 16-bit pool
of ids. The VM FATAL()s once it hits that limit.

Customers who run very large programs (significant amount of generated
code) on the Dart VM have started to hit this 16-bit class limit.

Concrete classes can have instances in the heap. Our current heap layout
only allows 16-bit class ids to be encoded in the header word. To avoid
increasing the size of heap objects or shrinking the size of the identity
hash code to 16-bit we keep class ids in object headers to be 16-bit.

Abstract classes cannot have instances in the heap. Though their class
ids are encoded in type objects. Furthermore we sort classes in
AOT/AppJIT mode to perform fast class-id range checks. To avoid impacting
this optimization we treat abstract classes the same way as concrete
classes.

Top-level classes cannot have instances in the heap. Their class ids are
only used in the runtime code, for example for hot-reload as well as
part of the service protocol.

=> We can allocate class ids outside the 16-bit range for top-level
classes, thereby freeing a significant amount of space in the 16-bit
range.

This CL does exactly that: We change classid_t to be int32_t. The
ClassLayout::id_ can now be assigned ids outside 16-bit range for
top-level classes. To do this we keep dart classes and top level classes
as separate arrays in the ClassTable.

Issue dart-lang#42533

See also b/160229360

Change-Id: I6710a644e7b0ab2d4f4c792bef8e1f91cb117421
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/153607
Commit-Queue: Martin Kustermann <[email protected]>
Reviewed-by: Ryan Macnak <[email protected]>
@rmacnak-google
Copy link
Contributor

9182d5e

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.
Projects
None yet
Development

No branches or pull requests

5 participants