Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A mechanism for virtual overloads #693

Closed
dcodeIO opened this issue Jun 27, 2019 · 37 comments · Fixed by #1208
Closed

A mechanism for virtual overloads #693

dcodeIO opened this issue Jun 27, 2019 · 37 comments · Fixed by #1208

Comments

@dcodeIO
Copy link
Member

dcodeIO commented Jun 27, 2019

Currently, the compiler does not support virtual overloads of class members, as one would expect from JS, but instead resolves the respective member statically from the contextual class type. This is noted in the docs so far but we should aim at resolving this.

What's necessary here is to decide on a mechanism to resolve the virtual function or field at runtime, using the unique class ids we already have in RTTI plus, possibly, unique ids for each virtual member.

Let's say we have A#foo() with foo possibly being overloaded by a class B extends A, the implementation must call B#foo() here if the actual instance we are dealing with is a B, not an A. Interfaces are similar.

One mechanism I thought about so far is to use the unique class ids (here: of A and B), and give the foo member another internal unique id, so we can generate a runtime lookup function that performs a (memberId, classId) -> functionIndex or fieldOffset mapping through a compiler-generated big switch. The implementation on the compiler side should skip making a runtime lookup if it is not necessary (that is: the class has no subclasses or respective interfaces) for perf reasons. If it is necessary, calling a virtually overloaded functions becomes a call_indirect.

This isn't exactly scientific but is solely based on what I think would work, so if this problem has already been solved in better ways in the past, feel free to comment :)

@MaxGraey, @willemneal, @bowenwang1996

@MaxGraey
Copy link
Member

MaxGraey commented Jun 27, 2019

I'm also not pretty experienced in different approaches but agree with your notes and also add on my own:

  1. All subclasses share one vtable which create (link) in basic class constructor.
  2. vtable not create if:
    • basic class hasn't any subclasses or interfaces
    • all methods are private or just fields
  3. only non-private methods with identical name and signatures (?) added to vtable. Methods declared in interfaces or abstract classes always add to vtable for class which realize its.
  4. add @final decorator? 🤔

@bowenwang1996
Copy link
Contributor

Sounds reasonably to me. Let's see whether @willemneal has something to add.

@dcodeIO
Copy link
Member Author

dcodeIO commented Jun 28, 2019

All subclasses share one vtable which create (link) in basic class constructor.

In this case, the functions would look like

function MyClass~virtual(memberId: u32): u32 { // classId implicitly is idof<MyClass>
  switch (memberId) {
    case 0: return SOME_CONST;
    case 1: return SOME_CONST;
    case 2: return SOME_CONST;
    default: unreachable();
  }
}

Questions are:

  • Can we guarantee that member ids are adjacent? Looks like mixing interfaces into different classes will lead to holes.
  • How much redundant code would having one function per class yield (like, each subclass would need one case even if the member isn't overloaded deeper in the tree)? Will we be missing optimization opportunities compared to one large function where Binaryen can optimize a single big switch?

The alternative is to reverse classId/memberId by using a single function that first switches over memberId, and only then over classId:

function ~virtual(memberId: u32, classId: u32): u32 {
  switch (memberId) { // guaranteed to be adjacent because memberId is a global counter
    case 0: {
      switch (classId) {
        // might become a br_table or a series of ifs depending on classId distribution here
      }
    }
    case 1: {
      ...
    }
    default: unreachable();
  }
}

In this case, the most important question is whether such a function will optimize well. On a first glimpse this "looks" like Binaryen can rereloop a lot of stuff, but haven't seen it in action yet ofc. Hmm...

@willemneal
Copy link
Contributor

My first question is are we going to now implicitly inherit from an Object class?

Also do we want interfaces to be structural and nominal like typescript? Personally I think we should.

I think the per class method would be a good first step; you could use the class id as the index into the function table or a memory address it's

What do you mean by "each subclass would need one case even if the member isn't overloaded deeper in the tree".

I also have another question. When are class id's computed currently? After parsing? Or during compilation?

@dcodeIO
Copy link
Member Author

dcodeIO commented Jul 2, 2019

My first question is are we going to now implicitly inherit from an Object class?

I think we should do this in the long term, yeah. If we can do this now, sure, why not.

Also do we want interfaces to be structural and nominal like typescript? Personally I think we should.

Depends if a static compiler can do that. Consider those two classes for example:

class Foo { a: i32; b: f32; }
class Bar { b: f32; a: i32; }

These are statically incompatible even though structurally compatible, but interfaces might not suffer from this due to all members being virtual. So I guess that there might be ways to make something structurally compatible sometimes, not but every time, which leaves the question whether we should do it at all. Might even be that structural things like these don't work well with static typing.

What do you mean by "each subclass would need one case even if the member isn't overloaded deeper in the tree".

In a single global function there can be switch fall-throughs, while if there's one function per class every function must either implement all the cases up the tree or fall-through by means of calling the overloaded virtual table function, leading to more code overhead than just a big switch (which might also optimize better because it's one large blob of code that can possibly take advantage of re-relooping).

When are class id's computed currently? After parsing? Or during compilation?

The class ids of ArrayBuffer, String, ArrayBufferView are generated when the program is initialized, as these are static, while all the others are generated as soon as a concrete class is resolved (from type arguments), which happens during compilation when the compiler sees the use of a type or expression that requires resolving a class.

@willemneal
Copy link
Contributor

So I guess that there might be ways to make something structurally compatible sometimes, not but every time

If we are going to make fields, e.i. getter functions, have a unique member ID, we will always be able to determine structurally compatibility.

Consider this interface:

interface AB {
  a: i32;
  b: f32;
}

Then both of the above classes would implement it structurally even if they aren't in the same order. If there is ever flow where either class could be used where an AB interface is expected, we would be able to test that each class contains the set of member ids of the interface. Go, typescript, and other languages have static structural typing so it's definitely possible. After being able to validate that the class implements the interface then its invocation the same as if it was nominal. E.g.

class Foo implements AB { a: i32; b: f32; } differs from class Bar implements AB { b:f32; a: i32;}

But the use of virtual functions means it doesn't matter.

I understand the want to keep everything in one function for compactness and potential optimizations, but one down side is a lack of modularity. For example, assuming we could merge wasm binaries (which will be a future runtime feature of the loader), we couldn't merge the two big virtual method lookup functions, whereas it would if each class is responsible for their table.

Perhaps we should add a step between parsing and compiling, where we resolve classes/interfaces. A special transform that starts at the entries and produces a condensed program object, which only contains statements that will be compiled and handle the type checking.

@dcodeIO dcodeIO pinned this issue Jul 6, 2019
@dcodeIO
Copy link
Member Author

dcodeIO commented Jul 6, 2019

Then both of the above classes would implement it structurally even if they aren't in the same order.

If field layout isn't statically compatible, accessing any of its members structurally must resort to doing a virtual lookup. So I assume what you are proposing here is to have something like hidden interfaces for class types that are used interchangeably in a structural way? This incurs some overhead (that might ofc be worth it).

I understand the want to keep everything in one function for compactness and potential optimizations, but one down side is a lack of modularity. For example, assuming we could merge wasm binaries (which will be a future runtime feature of the loader), we couldn't merge the two big virtual method lookup functions, whereas it would if each class is responsible for their table.

Not so sure about this point. There are already things that cannot be merged easily having just executables, like RC's __visit_members. My expectation there would be that merging requires at least enough meta information to first strip and then regenerate these features after the merge, so whether or not that'd be one or multiple functions isn't really important just for that. From a WebAssembly perspective, I think that static linking will assume two otherwise independent modules and make them work in one module (at least initially), while an AssemblyScript linker will reuse additional information to do it in a way that both modules join together to a single AS program.

More generally spoken, the way to go here seems to first get interfaces up and running, and only then think about structural compatibility of other types, like using hidden interfaces which we can always do. Does that sound reasonable?

@willemneal
Copy link
Contributor

Yeah that's what we should do for sure. My point was once we have the nominal interfaces, the implementing structural becomes easy. Assuming we have a virtual lookup for each method of a class that implements the interface, then all it would take is to add all classes that match it structurally as implementing the interface.

Regardless I think the place to start is what you first suggested:

function MyClass~virtual(memberId: u32): u32 { // classId implicitly is idof<MyClass>
  switch (memberId) {
    case 0: return SOME_CONST;
    case 1: return SOME_CONST;
    case 2: return SOME_CONST;
    default: unreachable();
  }
}

Each classid could correspond to an table entry that points to the class's virtual method like above.

Then each class has its methods in the table and function above would map the method ids to their table entries.

let func = call_indirect(load<u32>(ref - 8), memberid); // Calls MyClass~virtual.
call_indirect(func,...) // make actual function call.

Then after we get this to work interfaces are just classes with static methods that do the above look up on a constant memberid. After get some data about the performance we can revisit and use a different method.

Does this make sense?

@dcodeIO
Copy link
Member Author

dcodeIO commented Jul 9, 2019

Another pro for a single function might be that multiple levels of indirect calls can be avoided, since we don't have to look up the respective virtual function first. I also wonder if storing something (for methods) in class instance's memory can be avoided, e.g.

let func = ~virtual(memberId, classId); // direct call returning the function index, memberId is static
call_indirect(func, this, ...args) // make the actual function call

where ~virtual contains the statically known function indexes directly and avoids a second indirect call due to being a single big function. This seems like it can be more efficient, if I'm not missing something about storing in instance memory (please let me know if I do). Also, if it just so happens (memberId and classId are constant, even though classId usually isn't) that Binaryen can precompute the outcome of the ~virtual call, it can optimize the indirect to a direct call, but this might not necessary be possible in the single function case only.

@MaxGraey
Copy link
Member

MaxGraey commented Sep 15, 2019

How about Static Polymorphism? And CRTP which desugared (generated) by compiler?

@MaxGraey
Copy link
Member

MaxGraey commented Sep 15, 2019

So this code:

class B {
   i: i32 = 1;
   add(i: i32): void { // abstract method
   }
}

class D extends B {
    add(i: i32): void { // override subtyped method foo
       this.i += i;
    }
}

let boo: B = new D();
boo.foo(1);
log(boo.i); // should be 2

generate as:

class B {
   i: i32 = 1;
   add(i: i32): void { // abstract method
   }
}

class _B<T extends D> {
  i: i32 = 1;
  add(i: i32): void {
    changetype<T>(this).add(i); // generated in compile time
  }
}

class D extends _B<D> {
  add(i: i32): void { 
    this.i += i;
  }
}

let boo: _B<D> = new D(); // _B<D> generated in compile time
boo.add(1);
log(boo.i); // -> 2

Of course it as always speed / size trade-off so main disadvantage of this approach is quickly growing codesize and slower compilation but I guess this could be partially solved.

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 15, 2019

When we have a D down-cast to B, and pass that around to

function callMe(actuallyD: B): void {
  actuallyD.add(1);
}

how would the compiler not have to do a lookup there? I might be missing something, but CRTP seems like an alternative pattern that can be used in some situations in C++, but doesn't necessarily solve our inheritance model.

@MaxGraey
Copy link
Member

MaxGraey commented Sep 15, 2019

We don't cast to B actually. Instead we cast to statically generic class _B<D> which link to its supertype.

See this approach in C++ fiddle. I don't use virtual keyword.

@MaxGraey
Copy link
Member

MaxGraey commented Sep 15, 2019

Also article with benchmark

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 15, 2019

Isn't the use case there more like this, with interfaces, with B not an actual class?

@MaxGraey
Copy link
Member

MaxGraey commented Sep 15, 2019

That I guess called F-bounded quantification

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 15, 2019

What I'm trying to get at with my fiddle is that in your example the class B isn't an actual class but the recursive CRTP template, right? And D is a class that implements that interface, compiling to _B<D> statically. An actual second class E would compile to _B<E> statically. So what's covered here is exactly D implements B && E implements B, but not F extends E where down-casting can happen. As such, CRTP seems to be very specific to that exact "multiple classes, one interface, no subclasses" scenario.

For instance, how would that turn out?

class Foo {
  hi(): void {}
}

class Bar extends Foo {
  hi(): void {}
}

function callMe(actuallyBar: Foo): void {
  actuallyBar.hi();
}

@MaxGraey
Copy link
Member

MaxGraey commented Sep 15, 2019

Yeah, CRTP have some limitations as well as "static" instanceof. In rest scenario we could use double dispatch (two indirect calls) for example: https://gieseanw.wordpress.com/2018/12/29/stop-reimplementing-the-virtual-table-and-start-using-double-dispatch

I try imagine could we avoid vtable in some sort or this totally impossible)

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 15, 2019

Maybe it makes sense to elaborate a bit on my idea from above so we can compare performance of different alternative approaches to this one. What I was proposing is the following mechanism:

Every object in AssemblyScript has a classId that represents the top-most class of any managed object's inheritance hierarchy. Considering a scenario like this one:

class Foo { // classId=1
  hi(): void {} // memberId=1, placed at functionIndex=0
}

class Bar extends Foo { // classId=2
  hi(): void {} // memberId=1, placed at functionIndex=1
}

class Baz extends Bar {} // classId=3

function callMe(actuallyBar: Foo): void {
  actuallyBar.hi();
}

we can see inside of callMe that the classId of actuallyBar is 2=Bar (from a load<u32>(ptr - BLOCK_OVERHEAD)), not 1=Foo, and we know that #hi has memberId 1, which is the same for every overload of that method, and is known by the compiler statically.

We can generate the following lookup function

function ~virtual(classId=2, memberId=1): u32 {
  switch (memberId) {
    case 1: { // Foo|Bar|Baz#hi
      switch (classId) {
       case 1: return 0; // Foo#hi
       case 2: return 1; // Bar#hi
       case 3: return 1; // Bar#hi
      }
    }
  }
  unreachable();
  // the same goes for virtual fields, just that these don't return
  // a function table index, but a memory offset to load from
}

doing the following call inside of callMe

call_indirect(~virtual(classId, memberId), []);

leading to exactly one level of indirection through two br_tables, always, with no memory wasted on the actual instances to store function table indexes.

Also, if a class does not have any subclasses, like in

function callMe(baz: Baz): void {
  baz.hi();
}

which will be the case quite often, we can look up the #hi function at compile-time and make this a direct call. Overall this is very similar to the ~visit_members runtime functions we have for GC, or how instanceof is implemented.

So, if there are better approaches, that's about the baseline to beat: load -> direct call -> br_table -> br_table -> call_indirect

@MaxGraey
Copy link
Member

MaxGraey commented Sep 16, 2019

I see. Two jump tables. Probably this could have same speed as vtable:
https://www.cipht.net/2017/10/03/are-jump-tables-always-fastest.html

by the way I wonder why compilers codgen linear search for jump tables instead binary search? Which outperform vtables in performance

@MaxGraey
Copy link
Member

MaxGraey commented Sep 16, 2019

Also that about use one switch table instead two?

function ~virtual(classId: i32, memberId: i32): u32 {
  switch ((memberId << 16) | classId) {
     case (1 << 16) | 1: return 0; // Foo#hi
     case (1 << 16) | 2: return 1; // Bar#hi
     case (1 << 16) | 3: return 1; // Bar#hi
  }
  unreachable();
  // the same goes for virtual fields, just that these don't return
  // a function table index, but a memory offset to load from
}

or

function ~virtual(classId: i32, memberId: i32): u32 {
  switch ((<u64>memberId << 32) | <u64>classId) {
     case (<u64>1 << 32) | 1: return 0; // Foo#hi
     case (<u64>1 << 32) | 2: return 1; // Bar#hi
     case (<u64>1 << 32) | 3: return 1; // Bar#hi
  }
  unreachable();
  // the same goes for virtual fields, just that these don't return
  // a function table index, but a memory offset to load from
}

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 16, 2019

The idea about the double switch is that we can guarantee that memberId is sequential, so we have cases 0, 1, 2, 3 which are guaranteed to become a dense br_table, allowing the VM to pick the best possible machine code for that. We can't guarantee that the classId on the second level is dense, could look like cases 0, 10, 100 (but these are usually fewer than the first level), so Binaryen can pick a br_table or a sequence of br_ifs depending on whether one is optimizing for speed or size.

For a case (1 << 16) | 1: we can't guarantee anything, very likely yielding a sequence of br_ifs.

But actually, I see now why making multiple functions has an advantage (we talked about that above): We can skip the first level switch if we make one function per unique member:

function Foo|Bar|Baz#hi~virtual(classId): u32 {
  switch (classId) {
     case 1: return 0; // Foo#hi
     case 2: return 1; // Bar#hi
     case 3: return 1; // Bar#hi
  }
  unreachable();
}

bringing this down to: load -> direct call -> br_table/br_ifs -> indirect call
but adding a bit of size

@MaxGraey
Copy link
Member

Hmm, I created example with two and single tables and I see binaryen create gen better table for single table one.

Bench results pretty the same (in Chrome & Firefox):

two tables: 20.867919921875ms
one table: 21.202880859375ms

Aslo asm codegen:
virtual1 (two tables):

 0x000000:                              ; 0x000000 from: [0x000007, 0x000010, 0x000019, 0x00001f, 0x000027, 0x000030, 0x000039, 0x00003f, 0x000046, 0x00004e, 0x000057, 0x000060, 0x000066, 0x00006d, 0x000075, 0x00007e, 0x00008c, 0x0000a1, 0x0000ab, 0x0000b0]
  sub rsp, 8                            ; 0x000000 48 83 ec 08
  cmp esi, 1                            ; 0x000004 83 fe 01
  je 0x24                               ; 0x000007 0f 84 17 00 00 00
 0x00000d:                              
  cmp esi, 6                            ; 0x00000d 83 fe 06
  je 0x4b                               ; 0x000010 0f 84 35 00 00 00
 0x000016:                              
  cmp esi, 8                            ; 0x000016 83 fe 08
  jne 0xb0                              ; 0x000019 0f 85 91 00 00 00
 0x00001f:                              
  jmp 0x72                              ; 0x00001f e9 4e 00 00 00
 0x000024:                              
  cmp edi, 1                            ; 0x000024 83 ff 01
  je 0x44                               ; 0x000027 0f 84 17 00 00 00
 0x00002d:                              
  cmp edi, 2                            ; 0x00002d 83 ff 02
  je 0xb5                               ; 0x000030 0f 84 7f 00 00 00
 0x000036:                              
  cmp edi, 3                            ; 0x000036 83 ff 03
  jne 0x4b                              ; 0x000039 0f 85 0c 00 00 00
 0x00003f:                              
  jmp 0xb5                              ; 0x00003f e9 71 00 00 00
 0x000044:                              
  xor eax, eax                          ; 0x000044 33 c0
  jmp 0xba                              ; 0x000046 e9 6f 00 00 00
 0x00004b:                              
  cmp edi, 1                            ; 0x00004b 83 ff 01
  je 0xb5                               ; 0x00004e 0f 84 61 00 00 00
 0x000054:                              
  cmp edi, 2                            ; 0x000054 83 ff 02
  je 0x6b                               ; 0x000057 0f 84 0e 00 00 00
 0x00005d:                              
  cmp edi, 3                            ; 0x00005d 83 ff 03
  jne 0x72                              ; 0x000060 0f 85 0c 00 00 00
 0x000066:                              
  jmp 0xb5                              ; 0x000066 e9 4a 00 00 00
 0x00006b:                              
  xor eax, eax                          ; 0x00006b 33 c0
  jmp 0xba                              ; 0x00006d e9 48 00 00 00
 0x000072:                              
  cmp edi, 1                            ; 0x000072 83 ff 01
  je 0xb5                               ; 0x000075 0f 84 3a 00 00 00
 0x00007b:                              
  cmp edi, 2                            ; 0x00007b 83 ff 02
  je 0x9f                               ; 0x00007e 0f 84 1b 00 00 00
 0x000084:                              
  sub edi, 3                            ; 0x000084 83 ef 03
  mov ecx, edi                          ; 0x000087 8b cf
  cmp ecx, 3                            ; 0x000089 83 f9 03
  jae 0xb0                              ; 0x00008c 0f 83 1e 00 00 00
 0x000092:                              
  movabs rax, 0                         ; 0x000092 48 b8 00 00 00 00 00 00 00 00
  jmp qword ptr [rax + rcx*8]           ; 0x00009c ff 24 c8
 0x00009f:                              
  xor eax, eax                          ; 0x00009f 33 c0
  jmp 0xba                              ; 0x0000a1 e9 14 00 00 00
 0x0000a6:                              
  mov eax, 2                            ; 0x0000a6 b8 02 00 00 00
  jmp 0xba                              ; 0x0000ab e9 0a 00 00 00
 0x0000b0:                              
  jmp 0xfd                              ; 0x0000b0 e9 48 00 00 00
 0x0000b5:                              
  mov eax, 1                            ; 0x0000b5 b8 01 00 00 00
  nop                                   ; 0x0000ba 66 90
  add rsp, 8                            ; 0x0000bc 48 83 c4 08
  ret                                   ; 0x0000c0 c3

and virtual2 (one table):

wasm-function[1]:
 0x000000:                              ; 0x000000 from: [0x00000f, 0x00001b, 0x000027, 0x000033, 0x00003f, 0x00004b, 0x000057, 0x000063, 0x00006f, 0x00007b, 0x000087, 0x00008d, 0x000094, 0x00009b, 0x0000a2, 0x0000ac, 0x0000b1]
  sub rsp, 8                            ; 0x000000 48 83 ec 08
  shl esi, 0x10                         ; 0x000004 c1 e6 10
  or esi, edi                           ; 0x000007 0b f7
  cmp esi, 0x10001                      ; 0x000009 81 fe 01 00 01 00
  je 0x92                               ; 0x00000f 0f 84 7d 00 00 00
 0x000015:                              
  cmp esi, 0x10002                      ; 0x000015 81 fe 02 00 01 00
  je 0xb6                               ; 0x00001b 0f 84 95 00 00 00
 0x000021:                              
  cmp esi, 0x10003                      ; 0x000021 81 fe 03 00 01 00
  je 0xb6                               ; 0x000027 0f 84 89 00 00 00
 0x00002d:                              
  cmp esi, 0x60001                      ; 0x00002d 81 fe 01 00 06 00
  je 0xb6                               ; 0x000033 0f 84 7d 00 00 00
 0x000039:                              
  cmp esi, 0x60002                      ; 0x000039 81 fe 02 00 06 00
  je 0x99                               ; 0x00003f 0f 84 54 00 00 00
 0x000045:                              
  cmp esi, 0x60003                      ; 0x000045 81 fe 03 00 06 00
  je 0xb6                               ; 0x00004b 0f 84 65 00 00 00
 0x000051:                              
  cmp esi, 0x80001                      ; 0x000051 81 fe 01 00 08 00
  je 0xb6                               ; 0x000057 0f 84 59 00 00 00
 0x00005d:                              
  cmp esi, 0x80002                      ; 0x00005d 81 fe 02 00 08 00
  je 0xa0                               ; 0x000063 0f 84 37 00 00 00
 0x000069:                              
  cmp esi, 0x80003                      ; 0x000069 81 fe 03 00 08 00
  je 0xb6                               ; 0x00006f 0f 84 41 00 00 00
 0x000075:                              
  cmp esi, 0x80004                      ; 0x000075 81 fe 04 00 08 00
  je 0xa7                               ; 0x00007b 0f 84 26 00 00 00
 0x000081:                              
  cmp esi, 0x80005                      ; 0x000081 81 fe 05 00 08 00
  jne 0xb1                              ; 0x000087 0f 85 24 00 00 00
 0x00008d:                              
  jmp 0xb6                              ; 0x00008d e9 24 00 00 00
 0x000092:                              
  xor eax, eax                          ; 0x000092 33 c0
  jmp 0xbb                              ; 0x000094 e9 22 00 00 00
 0x000099:                              
  xor eax, eax                          ; 0x000099 33 c0
  jmp 0xbb                              ; 0x00009b e9 1b 00 00 00
 0x0000a0:                              
  xor eax, eax                          ; 0x0000a0 33 c0
  jmp 0xbb                              ; 0x0000a2 e9 14 00 00 00
 0x0000a7:                              
  mov eax, 2                            ; 0x0000a7 b8 02 00 00 00
  jmp 0xbb                              ; 0x0000ac e9 0a 00 00 00
 0x0000b1:                              
  jmp 0xe3                              ; 0x0000b1 e9 2d 00 00 00
 0x0000b6:                              
  mov eax, 1                            ; 0x0000b6 b8 01 00 00 00
  nop                                   ; 0x0000bb 66 90
  add rsp, 8                            ; 0x0000bd 48 83 c4 08
  ret                                   ; 0x0000c1 c3

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 16, 2019

The Wasm codegen there seems off. Maybe because memberId isn't sequential (0,1,2,3,4,5 instead of 1,6,8,...) or because Binaryen needs more cases to try to make a br_table, hmm.

Unrelated: Also noticed that something is wrong with Wasm Studio in that if one outputs x86 code, asc breaks. Maybe that conflicts with Binaryen versions or something.

@MaxGraey
Copy link
Member

btw great article how this solving in different languages (C++, Java and C#):
https://lukasatkinson.de/2018/interface-dispatch/

@willemneal
Copy link
Contributor

So it seems that they are all in memory tables. I've been working on it and I've currently updated the compiler so that whenever there is a conversion from a concrete type to an interface (e.g. passed into a function that has an interface as the parameter type), the concrete class is added to the interface's list of implementing classes (the class is also checked if it adheres to the interface, meaning that it handles structural interfaces as well). Then at call sites, the call convert into calling ~virtual passing in the signature id and then class id of the concrete object.

Then at the end of compilation each compiled interface method looks up the corresponding function prototypes of the implementing classes and compiles if necessary, adds to the function table, and collects the table entries.

I'm stuck figuring out the reloooper, but am getting close. However, after reading the article I am drawn back to adding in memory vtables and fat pointers seem like the best way to add mixins/traits/ default interface methods.

Also above once the interfaces methods are known, we could reassign them method ids so that they are sequential and then recompile the interface methods using the new ids.

All in all I think that there is some middle ground here. I think we should start with the table as is and then see how it affects performance.

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 22, 2019

Isn't one important difference to the mechanisms in the above article that we have a classId on each instance (that ultimately determines the vtable) and there's nothing like (<SomeInterface>someObject).someMethod like in Java that can call something different than someObject.someMethod in JS, so we don't even have to think about something like itables?

@willemneal
Copy link
Contributor

I've finally got the relooper working and am close to pushing what I have. It depends on if we want to leave ourselves open to adding new features that JS doesn't support. Also in the example you give, isn't that more of a syntax issue? Since JS is dynamic you could imagine the syntax <SomeInterface>someObject) as passing the instance to an interface function that returns a new object that you could call someMethod on.

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 22, 2019

What I am trying to get at is that JS with its prototypes is more like those languages at the bottom of that article that "do method lookup by name" and we should use that to our advantage. For instance, the type before (<T>someObject) is irrelevant in our case since we only use the classId anyway. This lets us simplify the virtual lookup in that implementing interfaces becomes merely a compile-time check (just a type, dont' have a classId).

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 23, 2019

One additional aspect is that we should abstract these internals in a way that one can still call virtual methods from the outside normally. For example, in a case like

export abstract class Foo { // or even an interface
  abstract bar(): i32;
}
class Baz extends Foo {
  bar(): i32 { return 42; }
}
export var foo = new Baz();

one might want to call Foo#bar(foo) externally without knowing that it is a Baz. In this case we'd have to generate a method Foo#bar anyway that then performs the lookup, so the method could as well be the virtual table itself, making this somewhat similar to CRTP again.

function Foo#bar(this): i32 {
  switch (LOAD_RT_ID(this)) {
    case Baz_ID: return Baz#bar(this);
  }
  return unreachable();
}
function Baz#bar(this): i32 {
  return 42;
}

Notice how this doesn't even have to call_indirect. Maybe that's what @MaxGraey meant initially and I just didn't get it yet, if I'm not missing something again.

@willemneal
Copy link
Contributor

My current method in #862 still allows for calling the method directly and passing the instance, but does bury the class lookup in the virtual method. So it's a simple change to move that logic into each method.

@MaxGraey
Copy link
Member

Little bit about de-virtualization in LLVM:
https://llvm.org/devmtg/2018-10/slides/Padlewski-Pszeniczny-Sound%20Devirtualization.pdf

@MaxGraey
Copy link
Member

MaxGraey commented Nov 6, 2019

Different approaches for dynamic dispatching with benchmarks (if-else sequence, binary tree dispatch, switch case and vtable):
https://hal.inria.fr/inria-00072218/document

@willemneal
Copy link
Contributor

fixed by #862

@willemneal
Copy link
Contributor

@dcodeIO @MaxGraey

In #862 I just added a @virtual decorator which allows virtual overloading. In the current test file for method overloading it mentions that in the future all overloading will be virtual.

Is this still the case? If so, it wouldn't be that hard to change now, but I think using the decorator approach allows for more flexibility.

@willemneal
Copy link
Contributor

After some feedback from @MaxGraey I make all overloaded methods virtual by default and as @final decorator to prevent it.

@dcodeIO
Copy link
Member Author

dcodeIO commented Jan 26, 2020

Figured that another issue here is that overloading instance fields/properties is problematic.

interface Foo {
  foo: i32;
}

Both a field and a property implementation meet the criteria as of TS, but in AS we are compiling these differently. Field accesses are just loads and do not retain for efficiency purposes (can assume that there's at least one remaining ref), while property getters do retain as these are functions (can return something dropping to RC=0). Must be taken into account by either enforcing overriding with properties or doing some magic under the hood.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants