fix: avoid conflict between uppercase/lowercase enum values, and ignore duplicate values #1095
fixes #1030
This PR allows the client generator to work with an enum that has values differing only by case - for instance, ["A", "a", "B", "b"]. Previously, such a value list would have caused an error like "Duplicate key A in Enum".
It also allows the client generator to work if two enum values are exactly the same - for instance, ["A", "B", "A"]. Previously this would have also caused a "Duplicate key" error.
Background
Enum values are case-sensitive in OpenAPI; the OpenAPI spec itself is silent on that issue, but it inherits its enum behavior from JSON Schema, where the values are definitely case-sensitive (also, I don't know of any language in which the default behavior for comparing JSON values would be case-insensitive). So the generator should support this use case.
The generator currently generates names for enum values that follow the standard Python constant naming convention: uppercase and underscores. Thus, if the allowable values for an enum MyType are ["A", "B", "otherValue"], the constant names will be MyType.A, MyType.B, and MyType.OTHER_VALUE. However, if another allowable value is "a", the generated constant name would also be MyType.A, so there would be a conflict.
The same name conflict problem would result if two values really are equal (case-sensitively)... which, as far as I can tell, is not invalid in OpenAPI. Since OpenAPI is geared toward validation, there should be no problem (besides redundancy) with saying that a value must match one of ["A", "B", "A"]. The problem only arises when we try to generate multiple Python constant names for equal values... which there is really no reason to do, since there is no semantic difference between the two "A"s in this example.
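To illustrate the duplicate case, here is a minimal sketch of dropping exact duplicates while keeping the original order. The function name `dedupe_values` is made up for this example and is not necessarily how the generator does it.

```python
def dedupe_values(values: list[str]) -> list[str]:
    """Drop exact (case-sensitive) duplicates, keeping the first occurrence of each."""
    # dict preserves insertion order, so the original ordering survives.
    return list(dict.fromkeys(values))


print(dedupe_values(["A", "B", "A"]))  # prints ['A', 'B']
print(dedupe_values(["A", "a", "B"]))  # prints ['A', 'a', 'B']
```

Values that differ only by case are left alone here, since they are distinct values as far as JSON Schema is concerned.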
Solution
Exact duplicate values are ignored. If the value list contains any values that differ only by case, then the behavior changes as follows: the constant name is simply the value with the minimal necessary changes to make it a valid Python symbol. That is, spaces are changed to underscores, all other disallowed characters are removed, and (as before) a "VALUE_" prefix is added if the value does not start with an alphabetic character. For the value list ["A", "a", "B", "otherValue"], the constant names will be MyType.A, MyType.a, MyType.B, and MyType.otherValue.
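Roughly, the fallback naming rule described above could look like the sketch below. Both function names are hypothetical and exist only for this illustration. The conflict check approximates "differ only by case" by comparing uppercased values, which may not match the generator's actual check.

```python
import re


def has_case_conflict(values: list[str]) -> bool:
    """True if two distinct values become equal once uppercased."""
    distinct = set(values)
    return len({v.upper() for v in distinct}) < len(distinct)


def case_preserving_name(value: str) -> str:
    """Make only the changes needed to turn value into a Python identifier."""
    name = value.replace(" ", "_")  # spaces become underscores
    name = re.sub(r"[^A-Za-z0-9_]", "", name)  # drop other disallowed characters
    if not name or not name[0].isalpha():
        name = "VALUE_" + name  # names must start with a letter
    return name


values = ["A", "a", "B", "otherValue"]
if has_case_conflict(values):
    print([case_preserving_name(v) for v in values])
    # prints ['A', 'a', 'B', 'otherValue']
```

A list like ["A", "B", "A"] has no case conflict under this check, so it would keep the existing uppercase naming after the duplicate is ignored.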
Limitations
There are still valid value strings that could cause a conflict in this implementation, due to the sanitization behavior. For instance, the values "two words" and "two_words" would conflict; so would "abc" and "a$$$$bc". A universally valid implementation would need to come up with a predictable name-escaping behavior that wouldn't collide for any unique input values. However, I think it would be undesirable for the default behavior to produce something like "two_x20words", because it's quite common for enum values to contain spaces and/or underscores, and developers would be annoyed by having to type escape sequences for these in their code. One workaround might be to add a config option that would guarantee valid and unique constant names for every possible value, at the expense of readability.
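For reference, a collision-free escaping scheme along the lines of "two_x20words" might be sketched as follows. This is not part of the PR. The function name `escape_name` and the exact escape format (fixed-width hex after `_x`, `_u`, or `_U`) are assumptions chosen for this example, and Python keywords such as "class" are not handled.

```python
def escape_name(value: str) -> str:
    """Encode value as an identifier so that distinct values never collide."""
    parts = []
    for i, ch in enumerate(value):
        # ASCII letters pass through; ASCII digits pass through except in first position.
        if ch.isascii() and (ch.isalpha() or (ch.isdigit() and i > 0)):
            parts.append(ch)
            continue
        # Everything else, including "_" itself, becomes a fixed-width escape,
        # so every underscore in the output marks the start of an escape.
        code = ord(ch)
        if code < 0x100:
            parts.append(f"_x{code:02x}")
        elif code < 0x10000:
            parts.append(f"_u{code:04x}")
        else:
            parts.append(f"_U{code:08x}")
    # The empty string maps to a lone underscore, which no other value produces.
    return "".join(parts) or "_"


print(escape_name("two words"))  # prints two_x20words
print(escape_name("two_words"))  # prints two_x5fwords
print(escape_name("a$$$$bc"))    # prints a_x24_x24_x24_x24bc
```

The output shows the readability cost: even a plain underscore has to be escaped to keep the mapping unambiguous, which is why this seems better suited to an opt-in config option than to the default.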
Compatibility
These changes affect behavior only for (valid) specs that the generator previously could not handle. Therefore, any projects that were successfully using the generator will see no difference.