-
-
Notifications
You must be signed in to change notification settings - Fork 2k
[Core] Extract Gherkin compatibility layer #1804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@mpkorstanje thanks for starting a branch for this. I've created a draft pull request so we have a place to discuss. |
I tried adding this to the cucumber-core pom: <dependency>
<groupId>io.cucumber</groupId>
<artifactId>gherkin</artifactId>
<version>5.2.0</version>
</dependency>
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>gherkin</artifactId>
<version>8.0.1-SNAPSHOT</version>
</dependency> Maven does not allow that, so I think we need to create a separate core module targetting gherkin 8 |
Looks like the group id + artifact combination must be unique so we can only have 1 gherkin on the class path when using maven. Applied an exclusion to v5 in the calculator example and now it just works. This should be sufficient for a spike-release. |
@@ -5,7 +5,7 @@ | |||
import org.junit.runner.RunWith; | |||
|
|||
@RunWith(Cucumber.class) | |||
@CucumberOptions(plugin = {"json:target/cucumber-report.json"}) | |||
@CucumberOptions() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Json plugin still uses gson.
private static CucumberFeature parseGherkin5(URI path, String source) { | ||
try { | ||
|
||
Stream<Messages.Envelope> messages = Gherkin.fromStream(new ByteArrayInputStream(source.getBytes(UTF_8))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This conversion back to stream is needed because we read the #encoding
pragma in Encoding.java. Ideally this is supported by Gherkin. Or we deprecate and drop it.
Stream<Messages.Envelope> messages = Gherkin.fromStream(new ByteArrayInputStream(source.getBytes(UTF_8))); | ||
|
||
Parser<Messages.GherkinDocument.Builder> parser = new Parser<>(new GherkinDocumentBuilder()); | ||
Messages.GherkinDocument gherkinDocument = parser.parse(source).setUri(path.toString()).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure where this URI goes but if it comes out again as string it might be useful. Would be better if this was a URI type object.
return new PickleCompiler().compile(document, path.toString(), source) | ||
.stream() | ||
.map(pickle -> new Gherkin8CucumberPickle(pickle, path, document, dialect)) | ||
.collect(Collectors.toList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use the messages here because we need some info from the AST in cucumber-jvm
this make it hard to use just the pickles.
@Override | ||
public int getScenarioLine() { | ||
List<Location> stepLocations = pickle.getLocationsList(); | ||
return stepLocations.get(stepLocations.size() - 1).getLine(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still correct? I had to swap this in Gherkin8CucumberStep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pickle steps coming from Examples
rows have two locations - in this order:
- index 0: The step
- index 1: The table row
It's sort of a stack trace. The line at index 1 "calls" the line at index 0.
By always getting the last element we point to the table row line (for pickles from Examples) and the step line for regular scenarios.
I think that's what we want to report as it points people at the line they most likely need to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I would appreciate a descriptive method to access the last line number. Perhaps we can call it pickleLine
because it is the line that defines the pickle. It would remove one obstacle that keeps me from using plain pickles in Cucumber.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea. Just like we've talked about adding a GherkinDocumentsModel
class to cucumber-messages, we could add a PickleModel
class to cucumber-messages as well.
In short - some utility classes for getting interesting data out of those dumb protobuf structs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Though I think utility classes are an anti-pattern when you own or can design the domain objects.
When ever you run into a pattern like:
Object dumb = ...
String propertyA = dumb.propertyA();
String propertyB = SmartUtils.propertyB(dumb)
It's better to rewrite it to a facade and use the facade everywhere. E.g:
Object dumb = ....
Smart smart = new Smart(dumb);
String propertyA = smart.propertyA()
String propertyB = smart.propertyB();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we’re after the same thing! This is basically the facade pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could provide different flavours of facades around protobuf messages. One could process a stream of historical messages (stored in an event store like database, heck could even be Kafka). This way we could generate more interesting data like historical trends for execution time, flakiness, failure rates or even lead times from new to passing. The computed data could be stored in the database (if the computation is time consuming. This could be implemented as streams and pipes, allowing for composition.
if (dialect.getButKeywords().contains(keyWord)) { | ||
return StepType.BUT; | ||
} | ||
throw new IllegalStateException("Keyword " + keyWord + " was neither given, when, then, and, but nor *"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally this would come from the pickle.
}); | ||
} | ||
|
||
return Stream.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No inheritance. 😢
pom.xml
Outdated
<dependency> | ||
<groupId>io.cucumber</groupId> | ||
<artifactId>gherkin</artifactId> | ||
<version>${gherkin.version}</version> | ||
<version>8.0.1-SNAPSHOT</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Make this a property for easier maintaince.
The |
|
My reaction too. But we can leave those alone and implement the message formatter. |
What do you mean by implementing the message formatter? Currently, So we need to remove the If we're going to leave all those formatters unchanged, perhaps we should move them into |
The idea was to ship with Gherkin 5 and allow people to test drive the protobuf message formatter with Gherkin 8 without getting into woods of fixing the existing formatters. |
I agree with that goal. Do you agree we should move everything that depends on gherkin5 into the new cucumber-gherkin5 module? |
No. That would be allot of work all the tests for the formatters are integration tests. |
So how do you suggest we break the hard dependency on Gherkin5 from cucumber-core so that Gherkin8 can be loaded at runtime? |
Like this:
And tell people not to use any formatters at runtime. |
Aha! |
Been thinking about this problem some more. And I think we've been ignoring the elephant in the room. People who write plugins (us included) will need an easy way to map pickles back to AST nodes - the whole reason You made the suggestion to use This approach has some other advantage too. You've mentioned in the past the desire to support other formats for feature files gherkin. A stable representation will allow multiple parser implementations. So in summary:
|
This all sounds good to me. Not sure when I suggested to use As you know I want to decouple Cucumber even more from Gherkin and "features". I intend to start implementing a Markdown->Pickle compiler soon, and maybe even one for Excel. At this point, the Gherkin and "feature" abstractions break down. It's ok for formatters to depend on the input format, so they can render results on top of the input, which requires custom mapping from pickles back to the input model. Cucumber Core shouldn't know about input formats - it should only use pickles for execution and reporting. I think of it as this pipeline (note how it's input format agnostic - no gherkin assumptions):
If I understand you correctly, you're suggesting implementing a query API for each input AST to make it easier to look up AST nodes from pickles. I think that makes a lot of sense. @vincent-psarga and I actually implemented this logic in Go when we ported the JSON formatter. This could easily be refactored and decoupled from the JSON parts, and then we could move it to the How does this sound? |
Sound in theory but I don't think pickles in their current state contain sufficient information to support both execution and reporting. I also think we'll need a Pickle tree as part of this. And I think if we do support multiple input formats reporters need to be agnostic of the input as well. Currently I don't see how that is supported. |
I think it will be quite difficult to implement formatters that are agnostic of the input format and procude a report that is similar to the input format. I'm not convinced about the hierarchy thing - I have a slightly different vision. Cucumber-React / HTML Formatter is able to display a hierarchy - not because we have hierarchy information in the pickles, but because it traverses the I don't know much about text-to-speech engines, but I imagine they have different implementations for different spoken languages. You can't feed French text to a Russian engine and expect good results. For the same reason I think a Gherkin pretty formatter (for example) can only pretty-print Gherkin, but not Markdown or Excel. Unless we come up with a montrosity abstraction for representing all possible document formats in the world. I think it's more useful to group formatters in two categories:
IDA formatters only need IDD formatters need AST + The AST is different for each input format:
Each IDD formatter will only work with one kind of input document. The Gherkin pretty formatter will not work with Markdown and Excel for example (like text-to-speech engines - it only works with one language). As you've pointed out, looking up AST nodes from pickles is complicated, because the AST has to be traversed first to build a lookup table. This is what I think that concept is good - we just need to change their name so we don't give the impression that this is a universal model for all kinds input documents:
I imagine at runtime we'd have a single instance of e.g. |
|
||
public ProtobufFormatterAdapter(OutputStream out) { | ||
try { | ||
Class<EventListener> delegateClass = (Class<EventListener>) getClass().getClassLoader().loadClass(PROTOBUF_FORMATTER); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now this is okay.
But we should upgrade the plugin system to make registering plugins via SPI possible. Then the protobuf formatter could live in its own module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely agree - I added it to the gherkin8 module out of pure laziness since I think we're still in spike mode. Let's do that before we merge this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool!
|
||
import static java.util.Collections.singletonList; | ||
|
||
public class ProtobufFormatter implements EventListener { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally the ProtobufFormatter
would be living in the core module. But can't because gherking 5 and 8 have clashing artifact names. Do we want to resolve this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it’s pragmatic to keep it here for now. We can move it once we get rid of gherkin 5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've resolved it by moving the protobuf formatter to it's own module. It only has a dependency on cucumber-plugin
and a shaded dependency on Gherkin 8.
I just checked out the contents of This will make it easier for people who want to consume reporting out to use the library to use it because it only contains what they'll need. Ideally we also put the smart delegates that use the message in a separate library (no smarts in And if we decided for some reason that protobuff doesn't work out we can switch to json, xml or some other format while keeping the same API. |
I think we can safely exclude the We use https://github.com/cucumber/cucumber/blob/master/.templates/java/scripts/check-jar.sh to verify jar contents during a release. I suggest we do 2 things:
|
We can use |
Regarding your suggestion to split up messages in smaller chunks. I want to do that eventually, but not until it's more stable. Some separation of concerns can be achieved in the single file by nesting related messages. The There has been a fair amount of flux in the message definitions lately, and I think that will continue until we have some feedback from real users. When it settles down a bit and has a track record we can clean it up and separate it. For now I think it is more important to make changes more easily. |
What should be possible now is the following:
In my mind this meets the criteria for a smooth migration. People can setup a reporting tool that consumes protobuf. Then drop the json formatter and switch to gherkin 8 to start using the rule keyword.
Fair enough. If possible I'd like to restrict the If anything I don't think piggy backing messages onto regular events is the way to go. Ideally we'd create a separate event that contains only the Message. But I would rather not commit to any specific implementation right now. |
Fixed a few formatters. Only the PrettyFormatter left to fix. The JSON and HTML are lost causes as expected. |
A think to consider: The protbuf formatter prints either ndjson or protobufs. Perhaps we should rename it to the cucumber messages formatter? That we it describe the protocol rather then the serialization format by the protocol. |
Notes to self:
|
On our last call we discussed how to implement new message-based formatters. Did we all agree we'll write them in the monorepo, using a shared test suite similar to gherkin's. We could use So I assume the new pretty formatter will go into the monorepo? There are a few there already, so should be relatively easy to parrot the design pattern. Essentially a Shall we start with a simple one that essentially just prints pickles? We could use Where do you think the pickle tree abstraction belongs? Inside |
Fixing the TestNG and JUnit formatter came down to changing one small thing. We'll still have to lift them out but they're no longer in the way of a release. This would be a change we can do in v5.x rather then a v5.0. So one less Yak to shave for now. There are a few implicit assumptions around pretty formatter. Before committing to polyglot project and deal with the input/ouputstream stuff. I'd like to see what the output actually looks like and how it behaves during parallel execution and if we can reasonably work around that. Currently Cucumber JVM has reasonable system for dealing with messages that occur out of order. We lose that mechanism if we switch to protobufs. If it all does work out it should be easy to lift the pretty formatter. The messages/events we need for the pretty formatter have a 1-1 correspondence. |
Yeah we'd need something like |
Done. |
I think we have something mergeable now. Only 5 things remain:
And maybe write a proper commit message for this whole thing. Should include a link to: https://github.com/cucumber/cucumber/tree/master/cucumber-messages |
bf5ed43
to
11fce15
Compare
Gherkin 6 introduced the `Rule` keyword and a new AST structure. This poses several problems. 1. Cucumber-JVM is closely tied to the Pickle structure of Gherkin 5. 2. The HTML and JSON formatters use the Gherkin 5 parser. 3. The JSON formatter is the defacto output standard for third party tools. 4. There is no schema for the JSON formatters output. To phase out the JSON formatter we'll need an alternative. This alternative is the `message` formatter. This plugin will write the output of Cucumbers execution to protobuf or ndjson file using the schema defined in `cucumber-messages`. Because `cucumber-messages` for Gherkin can only be generated by Gherkin 8 we need a way to run both Gherkin 5 and Gherkin 8 next to each other. By extracting a compatibility layer we can use both Gherkin 5 and Gherkin 8.
11fce15
to
8f56641
Compare
Broken by #1804. While covered by unit tests, the unit tests were broken by a change in the way feature identifiers were parsed. When parsing a feature path relative paths are expanded to full file uris. This resulted in the uri of the pickle not matching the uri provided by the test.
We want to support both Gherkin 5 and Gherkin 8 + Cucumber Messages in the next major release of Cucumber-JVM.