Nondeterministic Release Builds #77168
Comments
here is a reduction based on the Yams project (i'm guessing it could be reduced further). if you repeatedly run something like swiftc -O <file> -o <output> && md5 <output>, the checksum eventually changes because the ordering of the vector instructions is flipped. empirically it seems to take between 10 and 20 tries before this happens. certainly seems like there is some nondeterminism with whatever produces the
public enum Node: Hashable {
case mapping(Mapping)
}
// MARK: - Mapping
extension Node {
public struct Mapping {
private var pairs: [(Node, Node)]
}
}
extension Node.Mapping: Comparable {
public static func < (lhs: Node.Mapping, rhs: Node.Mapping) -> Bool {
fatalError()
}
}
extension Node.Mapping: MutableCollection {
public typealias Element = (key: Node, value: Node)
// MARK: Sequence
public func makeIterator() -> Array<Element>.Iterator {
fatalError()
}
// MARK: Collection
public typealias Index = Array<Element>.Index
public var startIndex: Index {
fatalError()
}
public var endIndex: Index {
fatalError()
}
public func index(after index: Index) -> Index {
return pairs.index(after: index)
}
public subscript(index: Index) -> Element {
get { fatalError() }
set { fatalError() }
}
}
extension Node.Mapping: Equatable {
public static func == (lhs: Node.Mapping, rhs: Node.Mapping) -> Bool {
fatalError()
}
}
extension Node.Mapping: Hashable {
public func hash(into hasher: inout Hasher) {
fatalError()
}
}
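The compile-and-checksum loop described above can be scripted. The following is a minimal sketch, not the commenter's actual script: `check_determinism` is a hypothetical helper name, the file names in the usage comment are placeholders, and `cksum` stands in for `md5` only because it is available on any POSIX system.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): rerun a build command and report
# the first run whose output checksum differs from the first run's checksum.
# Usage: check_determinism <runs> <output-file> <build command...>
# The body runs in a subshell, so its variables do not leak into the caller.
check_determinism() (
  runs="$1"; output="$2"; shift 2
  first=""
  i=1
  while [ "$i" -le "$runs" ]; do
    "$@" || exit 2                      # the build command itself failed
    # cksum is POSIX; swap in `md5` (macOS) or `md5sum` (Linux) if preferred
    current="$(cksum < "$output")"
    if [ -z "$first" ]; then
      first="$current"                  # remember the first run's checksum
    elif [ "$current" != "$first" ]; then
      echo "checksum changed on run $i"
      exit 1
    fi
    i=$((i + 1))
  done
  echo "stable across $runs runs"
)

# Example invocation (placeholder file names):
# check_determinism 50 ./reduced swiftc -O reduced.swift -o ./reduced
```

Because the flip reportedly shows up only after 10 to 20 builds, a run count well above that is needed before a "stable" result means much.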
Thanks for the reduced example file, @jamieQ! I was able to bisect the issue a bit, and it seems to be caused by the SIL optimization step. Using
I can take a deeper look at that stage later, but wanted to drop some findings.
i think this may not exactly be a swift issue, but rather an llvm one. i've yet to see the non-determinism manifest in the output of any swift compiler stage before
swiftc -O \
-emit-assembly \
-Xllvm -print-after-all \
-Xllvm -filter-print-funcs='$sSlsE5index_8offsetBy07limitedC05IndexQzSgAE_SiAEtF4main4NodeO7MappingV_Tgq5Tf4nnnd_n' \
<file> -o <output> > llvm-print-all.log 2>&1 && md5 <output>
diffing the outputs of the llvm logs produced something like this (this is where the outputs first differ):
--- llvm-lowering.correct.log
+++ llvm-lowering.flipped.log
<snip>
*** IR Dump After Complex Deinterleaving Pass (complex-deinterleaving) ***
define linkonce_odr hidden swiftcc { i64, i8 } @"$sSlsE5index_8offsetBy07limitedC05IndexQzSgAE_SiAEtF4main4NodeO7MappingV_Tgq5Tf4nnnd_n"(i64 %0, i64 %1, i64 %2) local_unnamed_addr #2 {
%4 = icmp slt i64 %1, 0
br i1 %4, label %67, label %5, !prof !17
5: ; preds = %3
%6 = icmp eq i64 %1, 0
br i1 %6, label %61, label %7
7: ; preds = %5
%8 = sub i64 %2, %0
%9 = sub i64 9223372036854775807, %0
%10 = add i64 %1, -1
%11 = tail call i64 @llvm.umin.i64(i64 %8, i64 %10)
%12 = tail call i64 @llvm.umin.i64(i64 %11, i64 %9)
%13 = add i64 %12, 1
%14 = icmp ult i64 %13, 5
br i1 %14, label %35, label %15
15: ; preds = %7
%16 = and i64 %13, 3
%17 = icmp eq i64 %16, 0
%18 = select i1 %17, i64 4, i64 %16
%19 = sub i64 %13, %18
%20 = insertelement <2 x i64> <i64 poison, i64 0>, i64 %0, i64 0
- %21 = call <4 x i64> @llvm.experimental.vector.interleave2.v4i64(<2 x i64> %20, <2 x i64> zeroinitializer)
+ %21 = call <4 x i64> @llvm.experimental.vector.interleave2.v4i64(<2 x i64> zeroinitializer, <2 x i64> %20)
br label %22
22: ; preds = %22, %15
%23 = phi i64 [ %27, %22 ], [ %19, %15 ]
%24 = phi <4 x i64> [ %21, %15 ], [ %26, %22 ]
%25 = call <4 x i64> @llvm.experimental.vector.interleave2.v4i64(<2 x i64> <i64 1, i64 1>, <2 x i64> <i64 1, i64 1>)
%26 = add <4 x i64> %24, %25
%27 = add i64 %23, -4
%28 = icmp eq i64 %27, 0
br i1 %28, label %29, label %22, !llvm.loop !19
29: ; preds = %22
%30 = call { <2 x i64>, <2 x i64> } @llvm.experimental.vector.deinterleave2.v4i64(<4 x i64> %26)
%31 = extractvalue { <2 x i64>, <2 x i64> } %30, 0
%32 = extractvalue { <2 x i64>, <2 x i64> } %30, 1
- %33 = add <2 x i64> %32, %31
+ %33 = add <2 x i64> %31, %32
%34 = tail call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %33)
br label %35
<snip>
which closely aligns with the original diff from the object file output as well as the assembly output. furthermore, if you compile the code and explicitly disable that pass (compile with so perhaps hunting for something non-deterministic within the logic for that pass is a good next place to investigate. additionally, disabling that pass explicitly for the affected code may serve as a workaround, though i'm unsure if there are any downsides associated with doing that.
update: it appears that there was a commit added last year to address non-determinism in that pass. not being super familiar with the domain logic there, it's unclear to me if that would definitively explain the observed issue here, but it certainly seems suggestive. the version of llvm used in the Swift 6 branch excludes this change, but the llvm-project target was recently bumped to a more recent version, so i would expect any nightly builds off main after that point to include the llvm change (and the Swift 6.1 branch should presumably also include it when it's created per the outlined branching process). i haven't personally tested a recent nightly to verify it resolves this issue, but that would presumably be the next thing to do.
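To find where two such `-print-after-all` logs first diverge without reading them side by side, something like the following could help. It is a rough sketch: `first_diff_line` is a hypothetical helper name, and it relies only on the `cmp` output format ending in `line N`.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the number of the first
# line at which two files differ; prints nothing when they are identical.
first_diff_line() (
  # cmp prints "<a> <b> differ: char N, line M" (some versions say "byte"
  # instead of "char"); the last field is the line number either way.
  # Note: if one file is a strict prefix of the other, cmp reports EOF on
  # stderr instead, and this helper prints nothing.
  cmp "$1" "$2" | awk '{ print $NF }'
)

# Example, using the log names from the diff above:
# first_diff_line llvm-lowering.correct.log llvm-lowering.flipped.log
```

The reported line can then be matched against the nearest preceding `*** IR Dump After ... ***` banner to identify the first pass whose output differs.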
had a chance to try and verify this theory on some of the development toolchains, and it appears that in toolchains built after the LLVM rebranch, which includes the suspected fix from LLVM main, the issue is resolved. repeatedly building & checking the md5 of the assembly output on the 10-27 snapshot (before the LLVM bump) consistently reproduces the issue, but using the 10-30 snapshot (after the LLVM bump) does not, even after several hundred compilations.
Same here, we've been verifying against the latest Swift snapshot (and disabling complex deinterleaving) on our repo and haven't seen anything crop up 🤞
We are observing this when compiling swift-syntax.
Description
Using Xcode 16 and Swift version 6.0 (swiftlang-6.0.0.9.10 clang-1600.0.26.2), I'm seeing some variance in the compiled .o object files with optimized/release builds. It seems to happen consistently in the standard library index(_:offsetBy:limitedBy:) function. Using otool -tV on 2 object files I see this diff for the binary I'm building with swift build -c release for the project: https://github.com/jpsim/Yams
I see a similar diff when compiling SwiftSyntax for release on the same disassembled index(_:offsetBy:limitedBy:) function.
Reproduction
I'm still trying to come up with a smaller repro project, but here are the steps I ran locally to reproduce using the Yams project (https://github.com/jpsim/Yams).
As the issue is nondeterministic, it can take a couple rebuilds as commented below to repro.
Expected behavior
Built object files are deterministic and reproducible in repeated builds with the same inputs.
Environment
swift-driver version: 1.115 Apple Swift version 6.0 (swiftlang-6.0.0.9.10 clang-1600.0.26.2)
Target: arm64-apple-macosx14.0
Additional information
No response