Skip to content

Deprecate validateAndCollect on JsonSchema #1063

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 13, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
85 changes: 56 additions & 29 deletions doc/collector-context.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
### CollectorContext

There could be use cases where we want collect the information while we are validating the data. A simple example could be fetching some value from a database or from a microservice based on the data (which could be a text or a JSON object. It should be noted that this should be a simple operation or validation might take more time to complete.) in a given JSON node and the schema keyword we are using.
There could be use cases where we want collect the information while we are validating the data. A simple example could be fetching some value from a database or from a microservice based on the data (which could be a text or a JSON object. It should be noted that this should be a simple operation or validation might take more time to complete.) in a given JSON node and the schema keyword we are using.

The fetched data can be stored somewhere so that it can be used later after the validation is done. Since the current validation logic already parses the data and schema, both validation and collecting the required information can be done in one go.

Expand All @@ -10,6 +10,12 @@ The `CollectorContext` and `Collector` classes are designed to work with this us

The `CollectorContext` is stored as a variable on the `ExecutionContext` that is used during the validation. This allows users to add objects to context at many points in the framework like Formats and Validators where the `ExecutionContext` is available as a parameter.

By default the `CollectorContext` created by the library contains maps backed by `HashMap`. If the `CollectorContext` needs to be shared by multiple threads then a `ConcurrentHashMap` needs to be used.

```java
CollectorContext collectorContext = new CollectorContext(new ConcurrentHashMap<>(), new ConcurrentHashMap<>());
```

Collectors are added to `CollectorContext`. Collectors allow to collect the objects. A `Collector` is added to `CollectorContext` with a name and corresponding `Collector` instance.

```java
Expand All @@ -28,29 +34,44 @@ However there might be use cases where we want to add a simple Object like Strin

```java
CollectorContext collectorContext = executionContext.getCollectorContext();
collectorContext.add(SAMPLE_COLLECTOR, "sample-string")
collectorContext.add(SAMPLE_COLLECTOR, "sample-string");
```

To use the `CollectorContext` while validating, the `validateAndCollect` method has to be invoked on the `JsonSchema` class.
This method returns a `ValidationResult` that contains the errors encountered during validation and a `ExecutionContext` instance that contains the `CollectorContext`.
Objects constructed by collectors or directly added to `CollectorContext` can be retrieved from `CollectorContext` by using the name they were added with.

To collect across multiple validation runs, the `CollectorContext` needs to be explicitly reused by passing the `ExecutionContext` as a parameter to the validation.
Implementations that need to modify values the `CollectorContext` should do so in a thread-safe manner.

```java
ValidationResult validationResult = jsonSchema.validateAndCollect(jsonNode);
ExecutionContext executionContext = validationResult.getExecutionContext();
CollectorContext collectorContext = executionContext.getCollectorContext();
List<String> contextValue = (List<String>) collectorContext.get(SAMPLE_COLLECTOR);
AtomicInteger count = (AtomicInteger) collectorContext.getCollectorMap().computeIfAbsent(SAMPLE_COLLECTOR,
(key) -> new AtomicInteger(0));
count.incrementAndGet();
```

To use the `CollectorContext` while validating, the `CollectorContext` should be instantiated outside and set for every validation execution.

// Do something with contextValue
...
At the end of all the runs the `CollectorContext.loadCollectors()` method can be called if needed for the `Collector` implementations to aggregate values.

// To collect more information for subsequent runs reuse the context
validationResult = jsonSchema.validateAndCollect(executionContext, jsonNode);
```java
// This creates a CollectorContext that can be used by multiple threads although this is not neccessary in this example
CollectorContext collectorContext = new CollectorContext(new ConcurrentHashMap<>(), new ConcurrentHashMap<>());
// This adds a custom collect keyword that sets values in the CollectorContext whenever it gets processed
JsonMetaSchema metaSchema = JsonMetaSchema.builder(JsonMetaSchema.getV202012()).keyword(new CollectKeyword()).build();
JsonSchemaFactory factory = JsonSchemaFactory.getInstance(VersionFlag.V202012, builder -> builder.metaSchema(metaSchema));
JsonSchema schema = factory.getSchema("{\n"
+ " \"collect\": true\n"
+ "}");
for (int i = 0; i < 50; i++) {
// The shared CollectorContext is set on the ExecutionContext for every run to aggregate data from all the runs
schema.validate("1", InputFormat.JSON, executionContext -> {
executionContext.setCollectorContext(collectorContext);
});
}
// This is called for Collector implementations to aggregate data
collectorContext.loadCollectors();
AtomicInteger result = (AtomicInteger) collectorContext.get("collect");
assertEquals(50, result.get());
```

There might be use cases where a collector needs to collect the data at multiple touch points. For example one use case might be collecting data in a validator and a formatter. If you are using a `Collector` rather than a `Object`, the combine method of the `Collector` allows to define how we want to combine the data into existing `Collector`. `CollectorContext` `combineWithCollector` method calls the combine method on the `Collector`. User just needs to call the `CollectorContext` `combineWithCollector` method every time some data needs to merged into existing `Collector`. The `collect` method on the `Collector` is called by the framework at the end of validation to return the data that was collected.
There might be use cases where a collector needs to collect the data at multiple touch points. For example one use case might be collecting data in a validator and a formatter. If you are using a `Collector` rather than a `Object`, the combine method of the `Collector` allows to define how we want to combine the data into existing `Collector`. `CollectorContext` `combineWithCollector` method calls the combine method on the `Collector`. User just needs to call the `CollectorContext` `combineWithCollector` method every time some data needs to merged into existing `Collector`. The `collect` method on the `Collector` is called by explicitly calling `CollectorContext.loadCollectors()` at the end of processing.

```java
class CustomCollector implements Collector<List<String>> {
Expand All @@ -70,16 +91,25 @@ class CustomCollector implements Collector<List<String>> {

@Override
public void combine(Object object) {
returnList.add(referenceMap.get((String) object));
synchronized(returnList) {
returnList.add(referenceMap.get((String) object));
}
}
}
```

CollectorContext collectorContext = executionContext.getCollectorContext();
if (collectorContext.get(SAMPLE_COLLECTOR) == null) {
collectorContext.add(SAMPLE_COLLECTOR, new CustomCollector());
```java
private class CustomValidator extends AbstractJsonValidator {
@Override
public Set<ValidationMessage> validate(ExecutionContext executionContext, JsonNode node, JsonNode rootNode,
JsonNodePath instanceLocation) {
CollectorContext collectorContext = executionContext.getCollectorContext();
CustomCollector customCollector = (CustomCollector) collectorContext.getCollectorMap().computeIfAbsent(SAMPLE_COLLECTOR,
key -> new CustomCollector());
customCollector.combine(node.textValue());
return Collections.emptySet();
}
}
collectorContext.combineWithCollector(SAMPLE_COLLECTOR, node.textValue());

```

One important thing to note when using Collectors is if we call get method on `CollectorContext` before the validation is complete, we would get back a `Collector` instance that was added to `CollectorContext`.
Expand All @@ -96,12 +126,9 @@ List<String> data = collectorContext.get(SAMPLE_COLLECTOR);
If you are using simple objects and if the data needs to be collected from multiple touch points, logic is straightforward as shown.

```java
CollectorContext collectorContext = executionContext.getCollectorContext();
// If collector name is not added to context add one.
if (collectorContext.get(SAMPLE_COLLECTOR) == null) {
collectorContext.add(SAMPLE_COLLECTOR, new ArrayList<String>());
List<String> returnList = (List<String>) collectorContext.getCollectorMap()
.computeIfAbsent(SAMPLE_COLLECTOR, key -> new ArrayList<String>());
synchronized(returnList) {
returnList.add(node.textValue());
}
// In this case we are adding a list to CollectorContext.
List<String> returnList = (List<String>) collectorContext.get(SAMPLE_COLLECTOR);

```
```
2 changes: 1 addition & 1 deletion src/main/java/com/networknt/schema/CollectorContext.java
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ public void combineWithCollector(String name, Object data) {
/**
* Loads data from all collectors.
*/
void loadCollectors() {
public void loadCollectors() {
Set<Entry<String, Object>> entrySet = this.collectorMap.entrySet();
for (Entry<String, Object> entry : entrySet) {
if (entry.getValue() instanceof Collector<?>) {
Expand Down
Loading