-
Notifications
You must be signed in to change notification settings - Fork 10.3k
Add support for reference-based schemas in OpenAPI document #56175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
return false; | ||
} | ||
|
||
return GetHashCode(x) == GetHashCode(y); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two objects might return true but not be the same if equals is implemented like this. Won't that be a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Say more? It should function as an equality operator as long as it's symmetric, reflexive, and transitive, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the typical expectation is that types with the same hash codes might be equal whereas types with different hash codes are definitely not equal.
Assuming I've done my implementation correctly, GetHashCode
for OpenApiSchema
and should return a unique value for each equivalent schema object (e.g. the bucket size is always 1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When working with hashcodes I don't think you can guarantee the bucket size is 1 given the nature of the objects. The data in the object is being reduced down to a 2^32 number. There are infinite possible ways of defining schemas that are reduced to a hashcode so it seems like at some point there must be hash collisions.
I question what is happening here - using hash code as equals goes against guidance - but on the other hand you're confident that's its ok. I don't feel certain enough in my understanding of hash tables to be sure. Maybe ping an expert in this area?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your questions here are legit. After some rubber ducking, it appears there is a gap here when evaluating OpenApiByte
and OpenApiBinary
types since they encapsulate a byte area that we were evaluating correctly. I did some hardening on the comparer implementations in 623fa19
I agree about not duplicating schemas for different collection types. Did you consider having some logic that knows about collections and ensures that JSON array types are shared, e.g. anything that implements
That seems like a simple and clean way to share collection type schemas. |
I'm not sure what you mean by "JSON array types" are shared. Some of the logic around generating the actual schema for the array types is hidden behind STJ's |
return false; | ||
} | ||
|
||
return GetHashCode(x) == GetHashCode(y); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Say more? It should function as an equality operator as long as it's symmetric, reflexive, and transitive, no?
|
||
internal sealed class OpenApiAnyComparer : IEqualityComparer<IOpenApiAny> | ||
{ | ||
public static OpenApiAnyComparer Instance { get; } = new OpenApiAnyComparer(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually give such types a private constructor, but I'm guessing you considered that to be too much ceremony.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean providing a private constructor over a static Instance
property or both? What's he value of the private constructor in the later case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually provide both so no one fails to notice the Instance
member and instantiates it themselves.
src/OpenApi/src/Transformers/Implementations/OpenApiSchemaReferenceTransformer.cs
Show resolved
Hide resolved
src/OpenApi/src/Transformers/Implementations/OpenApiSchemaReferenceTransformer.cs
Outdated
Show resolved
Hide resolved
src/OpenApi/src/Transformers/Implementations/OpenApiSchemaReferenceTransformer.cs
Outdated
Show resolved
Hide resolved
src/OpenApi/src/Transformers/Implementations/OpenApiSchemaReferenceTransformer.cs
Show resolved
Hide resolved
0a8f9c5
to
9ae6437
Compare
Apologies for the review request spam! I've rebased against main and changed the branch target for this PR from a feature branch to main. Support for friendly |
|
||
internal static class JsonTypeInfoExtensions | ||
{ | ||
private static readonly Dictionary<Type, string> _simpleTypeToName = new() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any other primitive-y types that should be added? For example I added DateOnly
, TimeOnly
, (U)Int128
and some others to the latest release of Swashbuckle (admittedly for a different purpose): JSON serializer context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DateOnly
, TimeOnly
, and a few others on the referenced link get handled by the JsonSchemaMapper for us. I'll do a few more comparisons but I think the set we have here is good enough.
@@ -17,16 +19,39 @@ internal sealed class OpenApiSchemaStore | |||
private readonly ConcurrentDictionary<OpenApiSchemaKey, JsonObject> _schemas = new() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some similar code here in case that's useful too.
"DateTime": { | ||
"type": "string", | ||
"format": "date-time" | ||
}, | ||
"IFormFile": { | ||
"type": "string", | ||
"format": "binary" | ||
}, | ||
"IFormFileCollection": { | ||
"type": "array", | ||
"items": { | ||
"$ref": "#/components/schemas/IFormFile" | ||
} | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this potentially bleed out some ".NET-ness" where more generic names could be used (form-file
or whatever)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially, if applicable, I think I might prefer FormFile
over form-file
since it is a bit more homogenous with remaining conventions and more closely aligns with valid identifier names in most programming languages. There's probably some implementation bleed around disclosing that this is an interface.
Yes. IMO when generating schemas for types then I think Why is it better to use the schema as a key than the type? |
f68e774
to
5b44e5e
Compare
The behavior that you described here matches the current behavior of this implementation. I added this snapshot test to more clearly illustrate the behavior of the reference schema. When it comes to generating the schemas themselves, we're still using the .NET type. It's only when we generate reference IDs that we bring in the uniqueness of schemas. This makes the implementation more tolerant to:
|
👈🏽 for reviews -- I think I've addressed a bulk of the feedback around the comparers in 623fa19. Would like to get this in for preview6 if possible. 🙏🏽 |
{ | ||
counter++; | ||
_referenceIdCounter[targetReferenceId] = counter; | ||
_schemasWithReference[schema] = $"{targetReferenceId}{counter}"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for referenceId "Foo"
and "Foo2"
to already be incoming referenceId's? I'm thinking something along the lines of X509Certificate
and X509Certificate2
but DTOs. It looks like this could conflict if the type without the 2 got an incremented counter due to special annotations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this is a possible (although unlikely) edge case. In the event that this happens, the implementation will throw an exception to the user about the unexpected conflict.
There's a couple of ways I could see around solving this:
- Throw a more descriptive exception when we try to insert duplicate IDs so the user has a little bit more insight into what is going on here.
- Provide an options hook that allows users to configure custom schema name rules that override are defaults.
I'm leaning towards the later given it's a generally useful feature that gives users more flexibility around the naming conventions. That way, you can map things appropriately if there are overloaded type names in your application.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed #56305 to discuss the second approach further.
throw new InvalidOperationException("The schema reference ID must be set on the schema."); | ||
} | ||
|
||
public Dictionary<OpenApiSchema, string?> SchemasByReference => _schemasWithReference; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be an ImmutableDictionary? Also, should we just make it an internal property instead of a field+property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplified this to a property. I'm not the biggest fan of ImmutableDictionary
given the requirements on using the builder pattern for it, especially because we're building up the list of schemas as the schema service gets called.
src/OpenApi/src/Transformers/Implementations/OpenApiSchemaReferenceTransformer.cs
Outdated
Show resolved
Hide resolved
src/OpenApi/src/Transformers/Implementations/OpenApiSchemaReferenceTransformer.cs
Show resolved
Hide resolved
I'll try to make time this week to rebase my gRPC transcoding + OpenAPI PR on top of this and do some more testing. I'm curious to see the results in (semi) real world usage. |
Sounds good! I'm curious to see how this flows with the gRPC integration. Happy to refine/add new APIs if anything shakes out as a result of that. |
Note: this PR targets a feature branch.
This pull request adds support for generating reference IDs for JSON schemas that are duplicated across the OpenAPI document. This implementation consists of three components:
IEqualityComparer
implementations that allow us to compare whether twoOpenApiSchema
instances are equivalentIOpenApiDocumentTransformer
implementation that walks the schemas in the OpenAPI document model and replaces inline schemas with reference ID-based schemas where applicableIn this implementation, references are generated based on the equality of the generated schemas, not the .NET type they are associated with. With this strategy, we avoid leaking details of .NET type system to the OpenAPI document. For example, types that map to the same JSON schema (like
IEnumerable<Todo>
andTodo[]
) will use the same reference ID. This strategy also allows us to account for transformations made to the OpenAPI schema via transformers in the final document.