Skip to content
This repository was archived by the owner on Nov 3, 2021. It is now read-only.

Simplify flags #40

Closed
rossberg opened this issue Nov 20, 2018 · 8 comments
Closed

Simplify flags #40

rossberg opened this issue Nov 20, 2018 · 8 comments

Comments

@rossberg
Copy link
Member

The proposal currently uses two flag bits for segments, one to distinguish active from passive, and another one for making the offset optional.

The second flag is unnecessary and I suggest removing it. It suffices to gate the absence or presence of the offset with the active/passive distinction. There is no need to retroactively add the ability to drop the offset for active segments. All this allows is saving one byte, for a construct that typically is both rare and whose size is by far dominated by its actual data contents. That doesn't justify the extra complexity of another flag.

@binji
Copy link
Member

binji commented Nov 20, 2018

The intention is that one flag bit (1) distinguishes active from passive, and another bit (2) indicates whether the memory or table index is present. So the following flag states are valid:

0: active segment, offset, no index (MVP)
1: passive segment, no offset, no index
2: active segment, offset, index present

3 is not valid, as a passive segment never requires an index.

We don't have to define flag 2 in this proposal, but we at least need to coordinate this with the reference types proposal.

@rossberg
Copy link
Member Author

Ah, sorry, my bad, I was reading it the wrong way round. Still, I wonder whether we should really upgrade active segments or rather just keep them as a legacy wart and limit all new functionality to passive segments.

@binji
Copy link
Member

binji commented Nov 20, 2018

Maybe so, though @lukewagner mentioned some reasons here why we'd still end up producing them from the toolchain.

I also discussed this with @sbc100 and @jgravelle-google a bit, and we could think of some cases where it might be nice to use multiple tables to handle separate function signatures, or to handle the case in C where a function pointer is converted to the wrong signature and called. AIUI this is UB, but happens in Python, and could be handled in a (somewhat) elegant way using multiple tables.

You can do all of this with passive tables too, of course, but it might be simpler (and perhaps smaller) to do it with an active table.

@rossberg
Copy link
Member Author

Agreed with the use case for multiple tables, but as you say, passive segments are fine for that. If we were to design segments today, I doubt we would include active segments in the design. So my argument is that they should not be optimised for anything but backwards compatibility.

I don't feel too strongly on this particular instance. However, I fear that it'll set a precedent for backporting all new functionality to active segments in the future, thus duplicating everything. Can we agree not to do that?

@binji
Copy link
Member

binji commented Nov 27, 2018

I'm OK with that, but reading @lukewagner's comment makes me think he might think otherwise. Luke, wdyt?

@conrad-watt
Copy link
Contributor

conrad-watt commented Nov 27, 2018

The current encoding seems to position active segments as a more efficient encoding for common patterns involving funcref tables. The presence of the active segment bit is taken as a sign that the elements will be given as function indexes. I think it makes sense to give active segments enough expressiveness to handle multiple tables of funcrefs.

I actually interpret this encoding as taking a minor position against backporting, because allowing active segments to fill other types of references would now require adding another flag case down the line.

If we cared about allowing active segments the full expressiveness of passive segments + desugaring, would it makes sense to conceptually have three bits?

bit 0: active/passive - active implies an offset in the encoding
bit 1: table implicit zero-index/explicit index
bit 2: func index representation/ref representation - ref representation implies elem_type in the encoding

so 000 would denote an active segment, described by a list of function indices, filling table 0
110 would denote an active segment, described by ref expressions, filling an explicitly indexed table
001 would allow a passive segment to be represented using function indices
x11 would be invalid binary representation because passive segments shouldn't have an index

If we expect to add this functionality to active segments down the line, maybe this encoding makes sense. Extending the current encoding down the line doesn't allow us to consistently interpret the third bit (i.e. it's always "0" right now, but passive segments get a full ref representation, while active segments get an index representation).

EDIT: I should clarify I'm happy with the encoding as it is now, as long as there's no expectation that active segments will ever be filling a non-funcref table.

@rossberg
Copy link
Member Author

Well, yes, that is exactly the messy direction I want to avoid.

@binji
Copy link
Member

binji commented Apr 5, 2019

It sounds like we're OK sticking with the 3 states described above:

0: active segment, offset, no index (MVP)
1: passive segment, no offset, no index
2: active segment, offset, index present

@binji binji closed this as completed Apr 5, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants