fix: avoid conflict between uppercase/lowercase enum values, and ignore duplicate values #1095

Closed
wants to merge 2 commits into from

Conversation

eli-bl
Collaborator

@eli-bl eli-bl commented Aug 13, 2024

fixes #1030

This PR allows the client generator to work with an enum that has values differing only by case - for instance, ["A", "a", "B", "b"]. Previously, such a value list would have caused an error like "Duplicate key A in Enum".

It also allows the client generator to work if two enum values are exactly the same - for instance, ["A", "B", "A"]. Previously this would have also caused a "Duplicate key" error.

Background

Enum values are case-sensitive in OpenAPI; the OpenAPI spec itself is silent on that issue, but it inherits its enum behavior from JSON Schema, where the values are definitely case-sensitive (also, I don't know of any language in which the default behavior for comparing JSON values would be case-insensitive). So the generator should support this use case.

The generator currently generates names for enum values that follow the standard Python constant naming convention: uppercase and underscores. Thus, if the allowable values for an enum MyType are ["A", "B", "otherValue"], the constant names will be MyType.A, MyType.B, and MyType.OTHER_VALUE. However, if another allowable value is "a", the generated constant name would also be MyType.A, so there would be a conflict.
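To make the conflict concrete, here is a minimal sketch of an uppercase-and-underscores naming rule. The function `to_constant_name` is a hypothetical stand-in written for this illustration; it is not the generator's actual code and only approximates the convention described above.

```python
import re


def to_constant_name(value: str) -> str:
    """Hypothetical approximation of UPPER_SNAKE_CASE naming (not the generator's code)."""
    # Insert an underscore at lower-to-upper boundaries: otherValue -> other_Value
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", value)
    # Collapse every run of non-alphanumeric characters into one underscore
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", spaced).strip("_")
    return cleaned.upper()


names = [to_constant_name(v) for v in ["A", "B", "otherValue", "a"]]
print(names)  # ['A', 'B', 'OTHER_VALUE', 'A']
```

Both "A" and "a" map to the name `A`, which is the collision this PR addresses.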

The same name conflict problem would result if two values really are equal (case-sensitively)... which, as far as I can tell, is not invalid in OpenAPI. Since OpenAPI is geared toward validation, there should be no problem (besides redundancy) with saying that a value must match one of ["A", "B", "A"]. The problem only arises when we try to generate multiple Python constant names for equal values... which there is really no reason to do, since there is no semantic difference between the two "A"s in this example.
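As a small illustration of why one constant per unique value is enough, the sketch below uses only the standard library `enum` module. It shows that Python itself refuses a reused member name, and that removing exact duplicates first avoids the problem. The class names `MyType` and `Broken` are hypothetical and exist only for this example.

```python
from enum import Enum

raised = False
try:
    # Reusing a member name inside an Enum body is rejected by Python.
    class Broken(str, Enum):
        A = "A"
        B = "B"
        A = "A"  # same name again
except TypeError:
    raised = True

# Dropping exact duplicates (order-preserving) leaves one name per value.
unique_values = list(dict.fromkeys(["A", "B", "A"]))
MyType = Enum("MyType", {v: v for v in unique_values}, type=str)

print(raised)                    # True
print([m.name for m in MyType])  # ['A', 'B']
```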

Solution

  1. If the list of enum values does not contain any string values that are case-insensitively equal to each other, the behavior is unchanged.

     If it does contain any such values, then the behavior changes as follows: the constant name is simply the value with the minimal necessary changes to make it a valid Python symbol. That is, spaces are changed to underscores, all other disallowed characters are removed, and (as before) a "VALUE_" prefix is added if it didn't start with an alphabetic character. For the value list ["A", "a", "B", "otherValue"], the constant names will be MyType.A, MyType.a, MyType.B, and MyType.otherValue.

  2. If the list of enum values contains any exact duplicates, only one constant will be generated for each unique value. For the value list ["A", "B", "A"], the constant names will be MyType.A and MyType.B.
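The two rules can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `upper_snake`, `minimal_sanitize`, and `enum_member_names` are hypothetical names, and `upper_snake` only approximates the existing naming behavior.

```python
import re
from typing import List, Tuple


def _prefix_if_needed(name: str) -> str:
    # Python identifiers used here must start with a letter.
    return name if name and name[0].isalpha() else "VALUE_" + name


def upper_snake(value: str) -> str:
    """Hypothetical approximation of the existing UPPER_SNAKE_CASE naming."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", value)
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", spaced).strip("_")
    return _prefix_if_needed(cleaned.upper())


def minimal_sanitize(value: str) -> str:
    """Case-preserving name: spaces become underscores, other disallowed chars are dropped."""
    name = re.sub(r"[^A-Za-z0-9_]", "", value.replace(" ", "_"))
    return _prefix_if_needed(name)


def enum_member_names(values: List[str]) -> List[Tuple[str, str]]:
    # Rule 2: keep one entry per unique value, preserving first-seen order.
    unique = list(dict.fromkeys(values))
    # Rule 1: switch naming only if two distinct values are case-insensitively equal.
    folded = [v.casefold() for v in unique]
    namer = minimal_sanitize if len(set(folded)) < len(folded) else upper_snake
    return [(namer(v), v) for v in unique]


print(enum_member_names(["A", "B", "otherValue"]))
# [('A', 'A'), ('B', 'B'), ('OTHER_VALUE', 'otherValue')]
print(enum_member_names(["A", "a", "B", "otherValue"]))
# [('A', 'A'), ('a', 'a'), ('B', 'B'), ('otherValue', 'otherValue')]
print(enum_member_names(["A", "B", "A"]))
# [('A', 'A'), ('B', 'B')]
```

Deduplicating before the case check means a list such as ["A", "B", "A"] keeps the unchanged uppercase naming, since no two distinct values differ only by case.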

Limitations

There are still valid value strings that could cause a conflict in this implementation, due to the sanitization behavior. For instance, the values "two words" and "two_words" would conflict; so would "abc" and "a$$$$bc". A universally valid implementation would need to come up with a predictable name-escaping behavior that wouldn't collide for any unique input values. However, I think it would be undesirable for the default behavior to produce something like "two_x20words", because it's quite common for enum values to contain spaces and/or underscores, and developers would be annoyed by having to type escape sequences for these in their code. One workaround might be to add a config option that would guarantee valid and unique constant names for every possible value, at the expense of readability.
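The remaining collisions, and one possible shape for the opt-in escaping idea, can be sketched as below. Everything here is hypothetical: `minimal_sanitize` mirrors the sanitization described in this PR only approximately, `find_collisions` is a helper invented for the example, and `escaped_name` is just one candidate scheme. It differs from the "two_x20words" form mentioned above by closing each escape with an underscore, so that an escape cannot run into the characters that follow it.

```python
import re
from collections import defaultdict
from typing import Dict, List


def minimal_sanitize(value: str) -> str:
    """Hypothetical case-preserving sanitizer (approximation of the PR's behavior)."""
    name = re.sub(r"[^A-Za-z0-9_]", "", value.replace(" ", "_"))
    return name if name and name[0].isalpha() else "VALUE_" + name


def find_collisions(values: List[str]) -> Dict[str, List[str]]:
    """Group distinct values that end up with the same sanitized name."""
    by_name: Dict[str, List[str]] = defaultdict(list)
    for value in dict.fromkeys(values):
        by_name[minimal_sanitize(value)].append(value)
    return {name: vals for name, vals in by_name.items() if len(vals) > 1}


def escaped_name(value: str) -> str:
    """Candidate escaping: keep ASCII letters/digits, write anything else as _x<hex>_."""
    out = "".join(
        c if c.isascii() and c.isalnum() else f"_x{ord(c):x}_" for c in value
    )
    return out if out and out[0].isalpha() else "VALUE_" + out


print(find_collisions(["two words", "two_words", "abc", "a$$$$bc"]))
# {'two_words': ['two words', 'two_words'], 'abc': ['abc', 'a$$$$bc']}
print(escaped_name("two words"), escaped_name("two_words"))
# two_x20_words two_x5f_words
```

The escaped names are harder to read and type, which is the trade-off described above and the reason such a scheme would be better as an option than as the default.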

Compatibility

These changes affect only (valid) specs that the generator previously could not handle. Therefore, any projects that were successfully using the generator will see no difference.

@eli-bl eli-bl force-pushed the issue-1030-enum-case branch from 43a8e48 to 85ef4d7 Compare August 13, 2024 21:42
@eli-bl eli-bl changed the title avoid conflict between uppercase/lowercase enum values fix: avoid conflict between uppercase/lowercase enum values Aug 13, 2024
@eli-bl eli-bl changed the title fix: avoid conflict between uppercase/lowercase enum values fix: avoid conflict between uppercase/lowercase enum values, and ignore duplicate values Aug 13, 2024
@eli-bl
Collaborator Author

eli-bl commented Aug 21, 2024

I missed that https://github.com/openapi-generators/openapi-python-client/pull/725/files already existed for the same issue.

Development

Successfully merging this pull request may close these issues.

Duplicate key in Enum when using both lowercase and uppercase strings
1 participant