Restriction of processing $vocabulary to meta-schemas is unnecessary and confusing #1098

Closed
handrews opened this issue May 8, 2021 · 8 comments
@handrews
Contributor

handrews commented May 8, 2021

I've noticed that people often think of $vocabulary as a very strange case of keyword, specifically asking why it is in the meta-schema and not the schema. The answer is that like all keywords in a schema, it describes the instance. $vocabulary is only meaningful when the instance is a schema (and therefore $vocabulary is being processed in a meta-schema).

But it's not harmful, except for performance, to "process" it in a normal schema. It has the same semantics, meaning that it indicates the JSON Schema keywords that could be used in the instance... which only makes sense if the instance is a schema.

But $vocabulary is essentially an annotation. It is applied to the instance, and the application (the same schema validator that was already running) uses it to load vocabulary support if needed. Annotating a non-schema instance with $vocabulary does nothing.

I think we can replace the last paragraph of §8.1.2:

The "$vocabulary" keyword MUST be ignored in schema documents that
are not being processed as a meta-schema. This allows validating a
meta-schema M against its own meta-schema M' without requiring the
validator to understand the vocabularies declared by M.

with something about $vocabulary behaving as an annotation, which would then allow us to completely remove §9.1.3 Detecting a Meta-Schema. Or just relax it from a MUST to a SHOULD or even a MAY, as there may well be some optimizations possible.

But I never really liked making meta-schema processing special, and having taken a break to come back and look at this with fresh eyes, I don't think it's needed at all. And maybe that would cut down on confusion about the nature and placement of $vocabulary. It truly is an annotation, which is further processed by the application that called the validator, which just happens to be the JSON Schema implementation itself. It may be worth a note that it's entirely reasonable to just look in a referenced meta-schema for $vocabulary in order to load features if validation against the meta-schema is turned off.
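As a rough illustration of that last point, here is a minimal Python sketch of loading vocabulary support by looking in the meta-schema named by $schema, without validating against it. The registry META_SCHEMAS, the set SUPPORTED_VOCABULARIES, the function vocabularies_for, and the example.com URIs are all hypothetical names made up for this sketch. Only the meaning of the $vocabulary values (true means required, false means optional) is taken from the spec.

```python
# Hypothetical in-memory registry standing in for meta-schema retrieval.
META_SCHEMAS = {
    "https://example.com/meta/custom": {
        "$schema": "https://example.com/meta/custom",
        "$id": "https://example.com/meta/custom",
        "$vocabulary": {
            "https://json-schema.org/draft/2020-12/vocab/core": True,
            "https://example.com/vocab/extras": False,  # optional vocabulary
        },
    }
}

# Assumption: this toy implementation only understands the core vocabulary.
SUPPORTED_VOCABULARIES = {"https://json-schema.org/draft/2020-12/vocab/core"}


def vocabularies_for(schema):
    """Return the vocabulary URIs to enable when processing `schema`.

    The lookup reads "$vocabulary" from the meta-schema referenced by
    "$schema". Any "$vocabulary" inside `schema` itself is never consulted,
    because it would describe the schema's instance, not the schema.
    """
    meta = META_SCHEMAS[schema["$schema"]]
    enabled = set()
    for uri, required in meta.get("$vocabulary", {}).items():
        if uri in SUPPORTED_VOCABULARIES:
            enabled.add(uri)
        elif required:
            # A required vocabulary that is not understood must stop processing.
            raise ValueError(f"required vocabulary not supported: {uri}")
        # Unsupported optional vocabularies are skipped.
    return enabled
```

Note that no special "am I processing a meta-schema?" mode appears anywhere in the sketch: the meta-schema is simply read as data.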

@Relequestual

This comment has been minimized.

@handrews
Contributor Author

handrews commented May 10, 2021

Note: @Relequestual suggested that I hide his comment just above this one. I'll add more here soon to clarify the topic I'm trying to address. That said, the off-topicness illustrates my point that $vocabulary appears more confusing than it is, so in that sense it was also on-topic 😅

@handrews
Contributor Author

handrews commented Jun 1, 2021

I think this ends up just being a clarification. The statement 'The "$vocabulary" keyword MUST be ignored in schema documents that are not being processed as a meta-schema.' is not testable AFAIK (paging @karenetheridge), so simply observing that it naturally would not have an effect is a clarification.

We cannot state that it MUST be treated as an annotation in the patch release (and that's worth further thought anyway), so I think I'll just put in a CREF noting that that is the current direction of thought, which we can formalize one way or the other in the next non-patch draft.

If anyone thinks this should not go in, simply object here or on the PR and that's enough to bump it out of the patch release.

@karenetheridge
Member

> The "$vocabulary" keyword MUST be ignored in schema documents that are not being processed as a meta-schema. is not testable AFAIK

We can test for this by using an unknown $vocabulary URI in a schema and checking that validation can still proceed successfully.
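A minimal sketch of what such a test could look like, in Python. The validate function is a hypothetical toy validator that only understands the type keyword (it is not any real implementation's API), and the vocabulary URI is invented for the example.

```python
# Toy type checks; a real validator supports far more than this.
TYPE_CHECKS = {
    "string": lambda value: isinstance(value, str),
    "object": lambda value: isinstance(value, dict),
}


def validate(schema, instance):
    """Hypothetical toy validator that only evaluates "type".

    Every other keyword, including "$vocabulary", is ignored here because
    `schema` is being applied to a plain instance, not used as a meta-schema.
    """
    expected = schema.get("type")
    if expected is None:
        return True
    return TYPE_CHECKS[expected](instance)


# An ordinary schema (not a meta-schema) that declares an unknown,
# "required" vocabulary. The URI is made up for this example.
schema_with_unknown_vocab = {
    "$vocabulary": {"https://example.com/vocab/unknown": True},
    "type": "string",
}

# Validation proceeds and the outcome depends only on "type".
accepts_string = validate(schema_with_unknown_vocab, "hello")
rejects_number = not validate(schema_with_unknown_vocab, 42)
```

If an implementation refused to run, or raised an error about the unknown vocabulary, the test would fail.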

@handrews
Contributor Author

handrews commented Jun 3, 2021

@karenetheridge yeah, that's definitely testable. Having thought about it more, I think I can work this out so that the "$vocabulary behaves mostly like an annotation" approach produces the same testable requirement. For now, it will have to be "behaves mostly like" because requiring it to actually be collected as an annotation would be a conformance change.

@karenetheridge
Member

> For now, it will have to be "behaves mostly like" because requiring it to actually be collected as an annotation would be a conformance change.

Agreed.

...But when we get there, I would propose having $schema generate an annotation, rather than $vocabulary, because it's $schema that appears in schemas (vs. metaschemas), and I think that might satisfy @jdesrosiers's desire for keyword-source information in validation results: all the vocabulary information can be found at the URI indicated by that $schema annotation, and it will appear in validation results whenever the metaschema happens to be altered. (I'm only writing this here so it's not forgotten; I am not trying to derail the conversation or to change the spec at a point in time when changes are not being considered.)

@handrews
Contributor Author

handrews commented Jun 6, 2021

@karenetheridge $schema is the only keyword that applies to the schema that contains it, rather than to the instance. $schema says nothing about the instance at all. It is essentially a self-annotation on the schema (an annotation rather than a reference because it is not automatically followed). $vocabulary, however, applies to the instance.

Given:

  • plain JSON Instance (I)
  • Schema (S)
  • Meta-Schema (MS)

Those keywords operate as follows:

  • $schema in S annotates S with the URI for MS, which can then be applied to S if desired.
  • $vocabulary in S annotates I with the vocabulary semantics it could use... but I doesn't use them because it's plain JSON
  • $schema in MS annotates MS with the URI for whatever its own meta-schema is (possibly itself, but maybe not)
  • $vocabulary in MS annotates S with the available vocabulary semantics, which it does use when it is in turn applied to I because S is a JSON Schema.

So $schema and $vocabulary aren't operating on the same target, so there's not really a concept of "we should annotate with one vs the other." They don't annotate the same thing.
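The four cases above can be condensed into a small Python sketch. The function annotation_target and the single-letter document labels are purely illustrative and are not part of any spec or implementation.

```python
# Which document each schema document is applied to:
# S describes I, and MS describes S.
APPLIES_TO = {"S": "I", "MS": "S"}


def annotation_target(keyword, containing_doc):
    """Return the document that `keyword` annotates when it appears in
    `containing_doc` ("S" for a schema, "MS" for a meta-schema)."""
    if keyword == "$schema":
        # $schema is a self-annotation on the document that contains it.
        return containing_doc
    if keyword == "$vocabulary":
        # $vocabulary, like other keywords, describes the instance.
        return APPLIES_TO[containing_doc]
    raise ValueError(f"keyword not covered by this sketch: {keyword}")


targets = {
    (keyword, doc): annotation_target(keyword, doc)
    for keyword in ("$schema", "$vocabulary")
    for doc in ("S", "MS")
}
```

In this sketch the two keywords never return the same target for the same containing document, which is the point made above.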

@handrews
Contributor Author

Closing in favor of #1281 where I think I did a much better job explaining this.

@handrews closed this as not planned on Aug 22, 2022