Skip to content

Calculate spare bits for multi-payload enums from first principles #73491

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 10, 2024

Conversation

tbkka
Copy link
Contributor

@tbkka tbkka commented May 7, 2024

This adds a getSpareBits method to all the TypeInfo classes
that returns a suitable bitmask indicating the spare bits available
in values of this type. This gives us a way to recursively explore
the type tree and build up a full spare bit mask for an arbitrary type.

Happily, it appears we actually do have enough information to do this
calculation entirely from first principles, without requiring additional
reflection information. So once this is stable, we should remove my
earlier incomplete effort to publish spare bit mask info in the reflection
data, as that adds unnecessary metadata to every binary.

This doubtless still has plenty of holes, but seems sufficient
to handle a few basic enum types, including the stdlib DecodingError
which was used as an example to work out some key issues.

Resolves rdar://126563813

tbkka added 3 commits April 29, 2024 16:50
This adds a `getSpareBits` method to all the TypeInfo classes
that returns a suitable bitmask indicating the spare bits available
in values of this type.  This gives us a way to recursively explore
the type tree and build up a full spare bit mask for an arbitrary type.

Happily, it appears we actually do have enough information to do this
calculation entirely from first principles, without requiring additional
reflection information.  So once this is stable, we should remove my
earlier incomplete effort to publish spare bit mask info in the reflection
data, as that adds unnecessary metadata to every binary.

This doubtless still has plenty of holes, but seems sufficient
to handle a few basic enum types, including the stdlib DecodingError
which was used as an example to work out some key issues.
@tbkka tbkka requested a review from a team as a code owner May 7, 2024 21:22
@tbkka tbkka requested a review from kastiglione May 7, 2024 21:23
@slavapestov
Copy link
Contributor

slavapestov commented May 7, 2024

One difficulty you might run into is that this depends on resilience boundaries. Eg,

public struct S {
  var x: AnyObject
}

public enum E {
  case first(S)
  case second(S)
}

The layout of E will use spare bits if and only if E is resilient or S is not resilient. That is, we have to consider whether any potential client of E "knows" what the spare bits of S are:

  • S resilient, E not resilient: no spare bits
  • S resilient, E resilient: spare bits ok
  • S not resilient, E resilient: spare bits ok
  • S not resilient, E not resilient: spare bits ok

This doesn't come up with extra inhabitants, because those are tracked dynamically.

@tbkka
Copy link
Contributor Author

tbkka commented May 7, 2024

One difficulty you might run into is that this depends on resilience boundaries.

Thanks for pointing that out. If necessary, we can try adding some resilience markers to the type and/or reflection metadata (which should be cheaper than putting the entire spare bit mask in the metadata).

@tbkka
Copy link
Contributor Author

tbkka commented May 8, 2024

@swift-ci Please test

@tbkka
Copy link
Contributor Author

tbkka commented May 8, 2024

@slavapestov Here's a related question: Do we ever under-utilize an added tag field? That is, if I have 100 cases will the compiler ever choose to add a tag byte and then only partially use the tag byte by depending on spare bits as well? It seems reasonable that once you've paid the memory cost of adding a tag byte/word/etc then you would take advantage of it to reduce code size and improve performance. If so, then RemoteInspection can assume tag-only semantics whenever it sees the compiler has added a suitably-large tag. Which would in turn partially address your concern above.

@tbkka
Copy link
Contributor Author

tbkka commented May 8, 2024

Answer: Yes, we do under-utilize the tag field. We always prefer spare bits and spill over to the tag field only as many bits as needed.

So eventually, we'll need to find a bit somewhere to stash whether or not a particular enum uses spare bits at all. But we should still be able to calculate the actual mask directly in RemoteMirror. We can still rip out my old experiment to store the full compiler-computed spare bit mask in the reflection data, which would be good.

@tbkka
Copy link
Contributor Author

tbkka commented May 9, 2024

@swift-ci Please test

@kastiglione
Copy link
Contributor

kastiglione commented May 9, 2024

This works for the use case I have run into:

(lldb) p error
(DecodingError) error = keyNotFound {
  keyNotFound = {
    0 = num
    1 = {
      codingPath = 0 values {}
      debugDescription = "No value associated with key CodingKeys(stringValue: \"num\", intValue: nil) (\"num\")."
      underlyingError = nil
    }
  }
}

@tbkka tbkka merged commit 03ea6a1 into swiftlang:main May 10, 2024
5 checks passed
@tbkka
Copy link
Contributor Author

tbkka commented May 10, 2024

@kastiglione I'll doubtless need your help to test this further and identify cases that are still unsupported. @slavapestov pointed out one big hole above that will impact certain resilient types. Otherwise, I believe that it should be pretty routine to address any missing cases in this code.

tbkka added a commit to tbkka/swift that referenced this pull request May 14, 2024
Calculate spare bits for multi-payload enums from first principles

This adds a getSpareBits method to all the TypeInfo classes
that returns a suitable bitmask indicating the spare bits available
in values of this type. This gives us a way to recursively explore
the type tree and build up a full spare bit mask for an arbitrary type.

We mostly have enough information to do this calculation entirely from first
principles, without requiring additional reflection information. So once this is
stable, we should remove my earlier incomplete effort to publish spare bit mask
info in the reflection data, as that adds unnecessary metadata to every binary.

TODO: Resilience forces some enums to not use spare bits even though
they otherwise would be able to.  We should have the compiler add a single
bit to the reflectio data indicating whether or not spare bits were used
and then rely on this code to actually compute the spare bits.

This doubtless still has plenty of holes, but seems sufficient
to handle a few basic enum types, including the stdlib DecodingError
which was used as an example to work out some key issues.

Resolves rdar://126563813
@tbkka
Copy link
Contributor Author

tbkka commented Jun 4, 2024

One difficulty you might run into is that this depends on resilience boundaries.

A-ha! Fortunately, such cases don't emit FixedTypeDescriptors, and we already have logic to map any MPE that lacks a FixedTypeDescriptor to the tag-only layout strategy. So we actually don't need any new logic to handle that correctly. (I do now have a test case, though. I'll certainly add that.)

tbkka added a commit to tbkka/swift that referenced this pull request Jun 5, 2024
…s also switches the

MPE layout code to exclusively use the new code.  The key observation: existing
reflection metadata seems to already provide enough information in all cases, so
we can abandon an earlier effort to add spare bitmask data.

Resolves rdar://129281368
tbkka added a commit that referenced this pull request Jun 6, 2024
Expand the work from #73491 to support more MPE layouts.
tbkka added a commit to tbkka/swift that referenced this pull request Jun 6, 2024
…fixes

Expand the work from swiftlang#73491 to support more MPE layouts.

This is also switches the MPE layout code to exclusively use the new approach. The key observation: existing reflection metadata seems to already provide enough information in all cases, so we can abandon an earlier effort to add spare bitmask data.

There is some risk that the old code may have handled some enums that aren't (yet) fully supported by the new code. However, I did a bunch of experiments and found that our existing test cases have decent coverage of the major capabilities and I've also added a new test case specifically to exercise enum layouts being accessed resiliently. So if there are regressions from this change, they should be minor and easily fixed.

Resolves rdar://129281368
ktoso pushed a commit to ktoso/swift that referenced this pull request Jun 14, 2024
…fixes

Expand the work from swiftlang#73491 to support more MPE layouts.

This is also switches the MPE layout code to exclusively use the new approach. The key observation: existing reflection metadata seems to already provide enough information in all cases, so we can abandon an earlier effort to add spare bitmask data.

There is some risk that the old code may have handled some enums that aren't (yet) fully supported by the new code. However, I did a bunch of experiments and found that our existing test cases have decent coverage of the major capabilities and I've also added a new test case specifically to exercise enum layouts being accessed resiliently. So if there are regressions from this change, they should be minor and easily fixed.

Resolves rdar://129281368
@tbkka tbkka deleted the tbkka-mpe-sparebits branch August 1, 2024 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants