Regular Expressions Match Inconsistently #1581

Closed
SJrX opened this issue Mar 13, 2019 · 3 comments · Fixed by #1722

SJrX commented Mar 13, 2019

Summary

A valid regular expression fails to match in Cucumber JVM 4.2.6.

Expected Behavior

When a step definition uses a regular expression containing \d+ (one or more digits) and no anchors, it should match.

Current Behavior

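The step fails with a cucumber.runtime.CucumberException reporting an arity mismatch (full output under Steps to Reproduce).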

Possible Solution

Not sure

Steps to Reproduce (for bugs)

Feature: Feature

  Scenario: Anchors
    Given 0 anchors1
    Given 0 anchors2
    Given 0 anchors3
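
import cucumber.api.java.en.Given;

public class Steps {

    // Regular expression with an end anchor: passes.
    @Given("(\\d+) anchors1$")
    public void anchorsOne(Integer int1) { }

    // Cucumber expression using the built-in {int} parameter: passes.
    @Given("{int} anchors2")
    public void anchorsTwo(Integer int1) { }

    // Same regular expression without anchors: fails with an arity mismatch.
    @Given("(\\d+) anchors3")
    public void anchors(Integer int1) { }
}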

0 anchors1 (PASS)
0 anchors2 (PASS)
cucumber.runtime.CucumberException: Step [(\d+) anchors3] is defined with 1 parameters at 'foo.Steps.anchors(Integer) in file:/.../classes/'.
However, the gherkin step has 0 arguments.
Step text: 0 anchors3
at cucumber.runner.PickleStepDefinitionMatch.arityMismatch(PickleStepDefinitionMatch.java:84)
at cucumber.runner.PickleStepDefinitionMatch.runStep(PickleStepDefinitionMatch.java:36)
at cucumber.runner.TestStep.executeStep(TestStep.java:63)
at cucumber.runner.TestStep.run(TestStep.java:49)
at cucumber.runner.PickleStepTestStep.run(PickleStepTestStep.java:43)
at cucumber.runner.TestCase.run(TestCase.java:45)
at cucumber.runner.Runner.runPickle(Runner.java:40)
at cucumber.runtime.Runtime$1.run(Runtime.java:82)
at cucumber.runtime.Runtime$SameThreadExecutorService.execute(Runtime.java:217)
at cucumber.runtime.Runtime.run(Runtime.java:79)
at cucumber.api.cli.Main.run(Main.java:26)
at cucumber.api.cli.Main.main(Main.java:8)

Context & Motivation

The lack of matches hurt me while upgrading from Cucumber 2.x to 4.2.6: I had to rework a number of regular expressions by adding anchors.

Your Environment

  • Version used: 4.2.6
  • Operating System and version: Linux, Arch.
  • Link to your project: n/a

SJrX commented Mar 13, 2019

I did notice that replacing \d with [0-9] does work as expected. I spent ~5 minutes digging into the code, but did not notice or figure out what exactly the cause is, other than it doesn't seem to recognize the capture group as an argument.
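
For comparison, here is a minimal sketch using plain java.util.regex. The class and method names are illustrative and not part of Cucumber. Both spellings of "one or more digits" match the step text and declare one capture group, so the regex engine itself treats them alike. That suggests the difference arises in how Cucumber interprets the step definition string, though this sketch does not exercise Cucumber itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitClassCheck {

    // Returns the number of capture groups declared in `regex` when it is
    // found somewhere in `text`, or -1 when the pattern does not match at all.
    static int groupsWhenFound(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.groupCount() : -1;
    }

    public static void main(String[] args) {
        String step = "0 anchors3";
        // Unanchored patterns, as in the failing step definition.
        System.out.println(groupsWhenFound("(\\d+) anchors3", step));    // prints 1
        System.out.println(groupsWhenFound("([0-9]+) anchors3", step)); // prints 1
    }
}
```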


mpkorstanje commented Mar 15, 2019

With the introduction of cucumber expressions, all regular expressions should start with ^ and end with $. There is an exception built in for when the symbols [].*+ are used inside parentheses (e.g. (.*)), but I think we should remove that exception to allow for more consistent behavior.

https://github.com/cucumber/cucumber/blob/master/cucumber-expressions/java/heuristics.adoc
https://github.com/cucumber/cucumber/blob/master/cucumber-expressions/java/src/main/java/io/cucumber/cucumberexpressions/ExpressionFactory.java

mpkorstanje added this to the 5.0.0 milestone Mar 15, 2019

SJrX commented Mar 15, 2019

Yeah, I think removing the exception makes a lot of sense. I had no clue about that feature and would never have guessed it was by design.

mpkorstanje added a commit to cucumber/common that referenced this issue Mar 16, 2019
Explaining when a string is interpreted as a cucumber expression should
be rather straightforward. In essence the explanation should be:

    If you surround a string with `^` and `$` it will be a regular
    expression. If not, it is a cucumber expression.

However the heuristic complicates this by checking if a capture group
contains regular expression symbols `[]+.*` with the assumption that
these are not used in normal writing.

This allows us to write little marvels like:

    this looks\( i.e: no regex symbols) like a cukexp
    a heavy storm forecast \(BF {int}+)
    the temperature is (\+){int} degrees celsius

Which look deceptively like cucumber expressions but are in fact regular
expressions. Simplifying the heuristic removes this ambiguity. This does
come at the cost of no longer being able to recognize obvious regular
expressions:

    this (.+) like a regexp
    this (\d+) like a regexp
    this ([a-z]+) like a regexp

But I think the simplicity is worth it.

Related:
  * cucumber/cucumber-jvm#1581
  * #515
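
The simplified rule quoted in the commit message can be sketched as follows. This is a literal reading of that one sentence, not the code in ExpressionFactory. The class, enum, and method names are hypothetical, and the real check may differ (for example, it might accept a single anchor).

```java
class ExpressionKind {

    enum Kind { REGULAR_EXPRESSION, CUCUMBER_EXPRESSION }

    // Assumption: a string counts as a regular expression only when it both
    // starts with ^ and ends with $. Everything else is a cucumber expression.
    static Kind classify(String expression) {
        boolean anchored = expression.startsWith("^") && expression.endsWith("$");
        return anchored ? Kind.REGULAR_EXPRESSION : Kind.CUCUMBER_EXPRESSION;
    }

    public static void main(String[] args) {
        System.out.println(classify("^(\\d+) anchors3$"));        // REGULAR_EXPRESSION
        System.out.println(classify("{int} anchors2"));           // CUCUMBER_EXPRESSION
        System.out.println(classify("this (.+) like a regexp"));  // CUCUMBER_EXPRESSION
    }
}
```

Under this reading, the last input is the trade-off the commit message names: an obvious regular expression is no longer recognized as one unless it is anchored.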
mpkorstanje mentioned this issue Aug 2, 2019
aslakhellesoy pushed a commit to cucumber/cucumber-expressions that referenced this issue Sep 20, 2021