Represent non-encodable test argument values in Test.Case.ID #1000

stmontgomery · 2025-03-06T22:32:40Z

This expands Test.Case.ID to represent combinations of arguments which aren't fully encodable, and uniquely distinguishes test cases associated with the same parameterized test function which have otherwise identical IDs.

Motivation:

It is possible to declare a parameterized test function with a collection of arguments that includes the same element more than once. As a trivial example:

@Test(arguments: [1, 1])
func repeatedArg(value: Int) { ... }

This can happen in more realistic scenarios if you dynamically construct a collection of arguments that accidentally or intentionally includes the same value more than once.

The testing library attempts to form a unique identifier for each argument passed to a parameterized test. It does so by checking whether the value conforms to Encodable, or one of the other protocols mentioned in the documentation (see Run selected test cases).

One problem with the current implementation is that if the value doesn't conform to one of those known protocols, the testing library gives up and doesn't produce a unique identifier for that test case at all. Specifically, in that situation the argumentIDs property of Test.Case.ID will have a value of nil.

Another problem is that if an argument is passed more than once, the derived identifiers for each argument's test case will be the same and the result reporting will be ambiguous at best; or worse, the lack of a unique identifier could cause an integrated tool to misbehave or crash. (Current versions of Xcode 16 experience this issue non-deterministically for projects which have test parallelization enabled.) To solve this, there needs to be a way to deterministically distinguish test cases which, from the testing library's perspective, appear identical—either because they actually are the same value, or because they encode to the same representation.

Modifications:

Add a new isStable boolean property to Test.Case.Argument.ID representing whether or not the testing library was able to encode a stable representation of the argument value it identifies.
Add a new SPI property named discriminator to Test.Case which distinguishes test cases associated with the same test function whose arguments are identical.
Add a corresponding property to Test.Case.ID with the same name, discriminator.
Change the Test.Case.ID.argumentIDs property to non-Optional, so that it always has a value even if one or more argument IDs is non-stable.
Add a derived boolean property isStable to Test.Case.ID whose value is true iff all of its argument IDs are stable.
Add Hashable conformance to Test.Case, since the reason for avoiding such conformance has been resolved.
Add new tests.

Checklist:

Code and documentation should follow the style of the Style Guide.
If public symbols are renamed or modified, DocC references should be updated.

Resolves #995
Resolves rdar://119522099

Resolves swiftlang#995 Resolves rdar://119522099

…zed test functions by making arguments and discriminator optional

Sources/Testing/ABI/Encoded/ABI.EncodedTest.swift

Sources/Testing/Events/Recorder/Event.HumanReadableOutputRecorder.swift

Sources/Testing/Parameterization/CustomTestArgumentEncodable.swift

Sources/Testing/Parameterization/Test.Case.Generator.swift

Sources/Testing/Parameterization/Test.Case.ID.swift

Sources/Testing/Parameterization/Test.Case.swift

grynspan · 2025-03-11T14:34:33Z

Sources/Testing/Parameterization/Test.Case.Generator.swift

+        let testCaseID = testCase.id
+
+        // Ensure test cases with identical IDs each have a unique discriminator.
+        let discriminator = state.testCaseIDs[testCaseID, default: 0]


What's the benefit of maintaining a dictionary here? What if the discriminator were just the enumerated() index of the case? Yes, we'd end up skipping over some discriminators, but so what—they'll still be unique in this context, and it'll be much cheaper to compute and store them. [P2]

The goal is different than you may be thinking: I'm not trying to assign each test case the next-highest integer as the discriminator, which would be equivalent to the "enumeration index" of the collection; instead I'm trying to locate the test cases for each parameterized test function which have identical arguments, and for each such subset, assign each of those test cases a unique discriminator. Consider this example:

@Test(arguments: [9, 9, 8, 7] func example(_: Int) {}

This will have 4 test cases, and with this PR's changes, they should be (in pseudo-code):

[ Test.Case(arguments: [9], discriminator: 0), Test.Case(arguments: [9], discriminator: 1), Test.Case(arguments: [8], discriminator: 0), // back to 0 here! Test.Case(arguments: [7], discriminator: 0), // and here! ]

The end goal is to be able to detect duplicates of the same set of arguments. If we used enumerated() on the original collection that would be harder for clients to detect duplicates, and for the testing library itself to detect and report warnings for them (which I intend to do in a follow-on PR). I also worry that would tempt clients to abuse the discriminator value and interpret it as an index of the original collection, and that's explicitly not what I'm intending—particularly because the order may differ between successive runs (for example for hashed collections).

I see—is there something we can do that's less heavyweight than a hash table?

Not sure. The use of a dictionary doesn't seem that heavyweight to me in context. This isn't now causing values to be encoded where they wouldn't otherwise be, if that's the concern.

One small optimization I just realized I could make, though, is to specify a capacity for the dictionary based on the generator's underestimatedCount.

A concern I have is that this is going to scale at least linearly with the number of test cases in a test. As of right now, we try to avoid keeping information about all test cases in memory during a run. If the inputs are an artisanally crafted array of values, then that's one thing. But if there's a dynamically generated sequence of inputs that's particularly long, we're now going to pay a memory cost to track all of them.

The dictionary only stores their IDs, not the test cases themselves, and I don't expect that to be onerous. I definitely share your view that we should not be holding on to test case instances themselves for the duration of the test run, or even for the lifetime of a Test.Case.Generator.

We already "accumulate" the IDs of Test instances in places like Event.HumanReadableOutputRecorder to keep track of events per-test. I imagine in the not-too-distant future we'll want to extend that pattern to test cases too, so that (for example) when running in verbose mode, our console output could log when test cases finish and how many failures each one had. That topic came up in #943, and was only partly solved by #972. One of my motivations for this PR is to allow us to properly implement that. I've been meaning to file a new issue to track that, so I just filed #1021.

grynspan

Mostly minor feedback, but a couple of points we ought to clarify/address.

stmontgomery · 2025-03-11T21:05:37Z

@swift-ci please test

stmontgomery · 2025-03-12T15:19:54Z

@swift-ci please test Linux

Sources/Testing/Parameterization/Test.Case.swift

…on each Test.Case.Argument, and reducing manual Codable implementations from 3 down to 1. Add more exhaustive serialization tests

…th older versions of the testing library, validated by more tests

stmontgomery · 2025-03-14T19:02:09Z

@swift-ci please test

…th status and issue counts (#1125) This outputs a console message when each test case of a parameterized test function ends, including its pass/fail status and a count of the number of issues which were recorded, when running in verbose mode (`swift test --verbose`). Here's an example: ``` ◇ Test contrivedExample(x:) started. ◇ Test case passing 1 argument x → 1 (Swift.Int) to contrivedExample(x:) started. ◇ Test case passing 1 argument x → 2 (Swift.Int) to contrivedExample(x:) started. ✔ Test case passing 1 argument x → 1 (Swift.Int) to contrivedExample(x:) passed after 0.001 seconds. ✘ Test contrivedExample(x:) recorded an issue with 1 argument x → 2 at EventRecorderTests.swift:759:3: Expectation failed: (x → 2) == 1 ↳ x: Swift.Int → 2 ↳ 1: Swift.Int → 1 ✘ Test case passing 1 argument x → 2 (Swift.Int) to contrivedExample(x:) failed after 0.001 seconds with 1 issue. ✘ Test contrivedExample(x:) with 2 test cases failed after 0.001 seconds with 1 issue. ``` > Note: This leverages #1000 which added more robust identification of test cases. Fixes #1021 Fixes rdar://146863942 ### Checklist: - [x] Code and documentation should follow the style of the [Style Guide](https://github.com/apple/swift-testing/blob/main/Documentation/StyleGuide.md). - [x] If public symbols are renamed or modified, DocC references should be updated.

Represent non-encodable test argument values in Test.Case.ID

616dd87

Resolves swiftlang#995 Resolves rdar://119522099

stmontgomery added enhancement New feature or request tools integration 🛠️ Integration of swift-testing into tools/IDEs parameterized-testing Related to parameterized testing functionality labels Mar 6, 2025

stmontgomery added this to the Swift 6.x milestone Mar 6, 2025

stmontgomery self-assigned this Mar 6, 2025

stmontgomery requested review from grynspan, briancroom, SeanROlszewski and suzannaratcliff as code owners March 6, 2025 22:32

briancroom approved these changes Mar 7, 2025

View reviewed changes

Improve the modeling of test cases (and their IDs) for non-parameteri…

ae2d351

…zed test functions by making arguments and discriminator optional

stmontgomery requested a review from briancroom March 10, 2025 22:29

briancroom approved these changes Mar 10, 2025

View reviewed changes