[DRAFT][ownership] Test stdlib with ownership + Andy patches #35942
Conversation
… ossa/non-ossa SIL. In SILCombine, we do not want to add or delete CFG edges; we are OK with swapping or replacing edges as long as the CFG structure is preserved. This becomes an issue here: performing this optimization would get rid of the error parameter but leave a try_apply, breaking SIL invariants. So to perform this optimization, we would need to convert to an apply and eliminate the error edge, breaking the aforementioned SILCombine invariant. For now, just do not perform this optimization and leave it to other passes like SimplifyCFG.
…ths where there is dynamically no value by inserting compensating destroys. This commit fixes two things:

1. In certain cases, either SILGen or the optimizer eliminates a destroy_addr along paths where we know that an enum is dynamically trivial. This cannot be expressed in OSSA, so I added code to pred-deadalloc-elim that checks whether any of our available values, after we finish promoting away an allocation, now need their consuming use sets completed.

2. That led me to discover that in certain cases a load [take] that we were promoting was an available value of another load [take]. This means that we have a memory safety issue if we promote one load before the other. Consider the following SIL:

```
%mem = alloc_stack
store %arg to [init] %mem
%0 = load [take] %mem
store %0 to [init] %mem
%1 = load [take] %mem
destroy_value %1
dealloc_stack %mem
```

In this case, if we eliminate %0 before we eliminate %1, we will be left with a stale pointer to %0.

I also took this as an opportunity to turn off predictable mem access opt on SIL that was deserialized, canonicalized, and non-OSSA. We evidently still need to do this for pred mem opts for perf reasons (not sure why), but I am pretty sure this isn't needed for pred mem access opt, and skipping it allows me to avoid some nasty code.
…complete available value when cleaning up takes.
While looking at the performance of the verifier running with -sil-verify-all on the stdlib, I noticed that we spend ~30% of the total verifier time performing this check. Introducing the cache mitigates this issue. I believe the reason is that, for each operand, we were walking the entire use list of its associated value, which is quadratic.
This PR just removes an unnecessary error raised in OwnershipLifetimeExtender::createPlusOneCopy. We can ownership-RAUW a value inside a loop with a value outside the loop; findJointPostDominatingSet correctly helps create control-equivalent copies inside the loop for the replacement.
…br that do not involve objects directly. Just to reduce the size of the CFG.
…dress through the phi using a RawPointer. In OSSA, we do not allow address phis, but in certain cases the logic of LoopRotate really wants them. To work around this issue, this PR adds code to LoopRotate that, as a post-pass, fixes up any address phis by inserting address <-> raw pointer adapters and changing the address phi to instead be of raw pointer type.
…that have at least one enum tuple sub-elt. Just until MemoryLifetime can handle enums completely.
…passes. This eliminates some regressions by removing a phase-ordering issue between ARCSequenceOpts and inlining involving read-only functions whose read-onlyness is lost after inlining.
@swift-ci test
@swift-ci test
@swift-ci benchmark
Build failed
Build failed
Performance: -O
Code size: -O
Performance: -Osize
Code size: -Osize
Performance: -Onone
Code size: -swiftlibs
How to read the data: The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%. If you see any unexpected regressions, you should consider fixing the regressions.
Noise: Sometimes the performance results (not code size!) contain false alarms.
Hardware Overview
Just for testing