-
-
Notifications
You must be signed in to change notification settings - Fork 307
Initial description of annotation collection #610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
<t> | ||
Annotations are collected by keywords that explicitly define | ||
annotation-collecting behavior. Note that boolean schemas cannot | ||
produce annotations as they do not make use of keywords. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not obvious enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, to you and me, yes, but I can see someone asking about the boolean schemas contributing to an annotation that indicates that a boolean schema was used. Or something. I think it's harmless to state explicitly, but let me know if you think it is confusing.
jsonschema-core.xml
Outdated
A collected annotation MUST include the following information: | ||
<list> | ||
<t> | ||
The name of the annotation, which MUST be identical to the keyword |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about simply "The name of the keyword that produced the annotation"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean
The name of the annotation, which MUST be the name of the keyword that produced it
because I would be fine with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, I see what you mean. Will update.
jsonschema-core.xml
Outdated
</list> | ||
</t> | ||
<t> | ||
If multiple schema locations attach values to the same instance location, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's an "instance location"? Maybe just "instance"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An instance is a document, an instance location may be the entire document, or may be some subset of the document which can be identified by a JSON Pointer. I use "instance location" a lot, and it is definitely not the same as "instance."
jsonschema-core.xml
Outdated
For example, one application may consider a "description" annotation that is | ||
in the same schema object as a "$ref" to override any "description" in the | ||
reference's target schema. A different application may prefer to concatenate | ||
all "description" annotations based on whatever ordering it defines. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to specify sometimes annotations can override one another, there should be a deterministic algorithm defined. Consider how CSS ranks which property is highest priority.
Presenting all the generated values as equals is probably the most theoretically pure way, but I'm not sure how practical this is for schema authors. I don't think I'll have a solid opinion on this until we see implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach comes from your repeated refusal to define how multiple values of default
get used, among other things. Which initially bothered me, but I eventually came to agree with you. Your argument was that different applications may want to handle it differently, which made sense- at the time, I think I had proposed three different variations on default
for different applications, which was ridiculous.
So that flexibility is what I am codifying here. Implementations should present enough information to applications so that they can choose their own usage. This is not CSS, and I do not see a one size fits all solution. Some keywords have obvious usage patterns, but those patterns are not the same for all keywords. And some keywords are not obvious. And some should arguably result in an application-level error as unresolvable.
When I have run this by other groups (such as OpenAPI) the flexibility has generally been seen as a positive. People have to get used to the idea, but once they do it seems to make sense to folks. I don't have a huge sample set here, but there's only so much begging and pleading I can do for feedback. Publishing a draft is the only thing that really provokes a lot of feedback.
Presenting all the generated values as equals is probably the most theoretically pure way, but I'm not sure how practical this is for schema authors.
They are not being presented as equals, they are being presented with their schema location, which allows applications to make their own distinctions.
I don't think I'll have a solid opinion on this until we see implementations.
The best way to get the broadest feedback from implementors is to avoid specifying a precedence. People will either make use of the flexibility, or we will learn that there is, actually, a clear typical usage, to the point that standardizing it will be beneficial. And then we can do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@awwright I'm going to add an example to make it more clear that we're not just punting on the question of precedence, but rather providing tools for applications to define their own in a clear way.
jsonschema-core.xml
Outdated
If any keyword in a schema object produces a false assertion | ||
result, then all annotations from all keywords in that schema | ||
object, and any of its subschemas or referenced schemas, MUST | ||
be discarded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be shortened for clarity, I think.
"Annotations are only generated from instances that validate against the schema."
And care is needed so we don't confuse people implementing "not". The note about "not" is only (and should only be) non-normative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wording is very deliberate. It may need clarification, but simply saying "annotations are only generated from instances that validate against the schema" is insufficient. There is a process going on of walking both the schema and instance, and it matters when and where in the process the annotations are discarded. I don't think there is actually another process that makes sense, but we should be clear about it rather than making implementors each re-derive it.
I'm happy to handle the advice on "not"
differently. We have not, to date, been all that rigorous about normative and non-normative, so I'd like to consider that question holistically rather than block this PR on it. I'm sure there are other things that have crept in that are non-normative. Probably a lot of Hyper-Schema since there are so many examples and "here's how to use it with HTTP" guidance in there. Feel free to file an issue on this, or we can just consider it during final review.
I can remove the "not"
part entirely if you are concerned about leaving it in. I would rather do that than hold up the rest of this PR, and then come back to the question of "not"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@awwright I'm going to remove most of the "Annotations and Applicators" section to get rid of the details around not
and similar cases, and try to clarify why I'm using the "must be discarded" language in the "Annotations and Assertions" section.
This covers the basics of annotation collection, and its interaction with assertions and applicators. It does not get into the exact output structure for annotation results, which is intentional. The goal is to establish the process, and then nail down the format once the process is solid.
@awwright I've made several updates and tried to incorporate most of your feedback. I've removed a number of potentially confusing details- hopefully people will understand how they follow from the simpler descriptions. I also added an example of how an application might make use of multiple values. Which hopefully will make the point and usefulness more clear. If you still really want the spec to define behavior for every keyword, let's split that out into a new issue (in which case I will remove the controversial language from this PR so that we can merge the uncontroversial parts). The two things I did not change are:
|
jsonschema-core.xml
Outdated
A collected annotation MUST include the following information: | ||
<list> | ||
<t> | ||
The name of the keyword that defines the annotation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would "produces" be more accurate than "defines"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'll change that and then commit, thanks!
* Simplify wording around annotation keywords and names * Remove potentially confusing details about the collection process, particularly regarding applicators such as "not" * Add an example to clarify why the handling of multiple annotation values is deferred to applications, and show how applications might make use of this flexibility.
If validation error response is also powered by annotations, then boolean schema that produces the error should be in annotation, shouldn't it? |
NOTE: This PR uses language added in PR #609, which was recently merged.
This covers the basics of annotation collection (#530), and its interaction
with assertions and applicators. It does not get into the exact
output structure for annotation results, which is intentional.
The goal is to establish the process, and then nail down the
format once the process is solid.