timeout using Thread.interrupt() is unreliable #1506

christianhujer · 2018-11-23T08:12:49Z

Summary

I have the following Java code:

    @Before(timeout = 500)
    public static void startSomething() {
        someOperation();
    }

The class cucumber.runtime.Timeout of cucumber-jvm/core fails to actually timeout in all situations.

Expected Behavior

The expected behavior is that if someOperation() takes longer than 500ms, Cucumber always fails the test.

Current Behavior

The current behavior is that if someOperation() takes longer than 500ms, the behavior depends on the situation.

If someOperation() can react to Thread.interrupt(), the timeout works and Cucumber fails the test.
Example:

    static void someOperation() throws InterruptedException {
        Thread.sleep(1000);
    }

If someOperation() performs blocking I/O which never returns, the timeout is ineffective and Cucumber keeps on running forever.
Example:

    static void someOperation() throws IOException {
        System.in.read();
    }

The problem is that cucumber-jvm/core class Timeout relies on Thread.interrupt(). This way, timeouts only work for interruptible operations. This has two problems:

Code which is not interruptible can exceed the timeout without reporting the excess.
Code which got interrupted but does not clear the interrupt flag can also not be tested with Timeout.

Possible Solution

The solution is to not use Thread.interrupt() to implement test timeouts. If a preemptive timeout is desired, it could be implemented like org.junit.jupiter.api.AssertTimeout.assertTimeoutPreemptively(). If a non-preemptive timeout is desired, it could be implemened like org.junit.jupiter.api.AssertTimeout.assertTimeout().
I think that timeouts in annotations, unless specified otherwise, would probably best be preemptive timeouts. The intention of using interrupt() probably was to leave the decision between preemptive and non-preemptive timeouts to the code under test. But that caused Cucumber timeouts to not be suited to create robustness for test-runs in case code runs into endless loops or blocks.

Workarounds

Implement my own timeout() in @Before which will run the method to be timed out in a separate thread and kills the thread in case of timeout.
Use org.junit.jupiter.api.Assertions.assertTimeoutPreemptively().

Steps to Reproduce (for bugs)

Steps to reproduce: See Summary / Expected Behavior / Actual Behavior

Context & Motivation

In acceptance tests for a backend (written in Golang), I'm using Cucumber-Java (because of Selenium) to test the REST endpoints and its frontend. There's a @Before to start the backend process. When the process is started, it is expected to print its URLs on STDOUT which the Java wrapper would parse and keep for the other steps. The process was bogus, not printing the URLs, but running, so the readLine() call reading from the process' STDOUT would not return. I expected @Before(timeout=500) to catch this, but it didn't.

Your Environment

Version used: io.cucumber:cucumber-java:4.2.0

Probably not relevant:

Operating System and version: Kubuntu 18.10, OpenJDK 11.0.1

The text was updated successfully, but these errors were encountered:

mpkorstanje · 2018-11-23T12:02:42Z

I see you've found one Cucumbers warts. Timeout was bolted on and never changed because it was good enough for most use cases.

If you're looking to fix this beware that each step and hook in a scenario should be called by the same thread. Simply creating a new thread in Timeout to invoke the step on is not sufficient. Failing to do so means that ThreadLocal will not work as expected in the system under test. The executing thread should be created in either the Runner or TestCase and then reused for each step and hook in a scenario.

christianhujer · 2018-11-24T08:19:37Z

I almost started making a change, but the ThreadLocal that you've mentioned was what kept me from just doing this. It may break tests of existing users which rely on ThreadLocal. I don't know if ThreadLocal is used by Cucumber itself, but I would expect that this shouldn't matter, because the implementations of steps should not access ThreadLocal variables created/owned by Cucumber, isn't it?

So far, I'm thinking along those lines/options for users:

enum TimeoutStrategy {
    /** The main thread will be interrupted using {@code Thread.interrupt()} when the timeout expires.
     * This is the original timeout implementation.
     * The tests will hang if the step does not complete and does not respond to {@code Thread.interrupt()}.
     */
    INTERRUPT,

    /** The step will be executed and marked failed if it took longer than the given time.
     * There is no protection from rogue steps.
     * This implementation is similar to JUnit's {@code assertTimeout()}.
     */
    JUST_WAIT,

    /** The step will be executed in a new thread which will be killed the moment it hasn't completed at the timeout.
     * This implementation is similar to JUnit's {@code assertTimeoutPreemptively()}.
     */
    FORK_PREEMPT,

    /** The whole test will be stopped the moment step exceeds the timeout. */
    EXIT,

    /** Use a custom strategy which implements the following interface: TODO. */
    CUSTOM
}

Any opinions on that?

mpkorstanje · 2018-11-24T10:13:29Z

The problem is with ThreadLocal in the system under test, not in Cucumber. For example the System class keeps track of an single long running operation in a ThreadLocal. The first test works, the second will fail because the awaiting thread never started an operation.

class System {
   ThreadLocal<Operation> operation = ....
   
   void startOperation(){
         // create a long running operation and stores a reference to it in the thread local 
   }

   void awaitOperationFinished(){
      // awaists the operation stored in the thread local, throws if there is no operation
    }
}

Assertions.assertTimeoutPreemptively(Duration.ofSeconds(5), () -> {
  // Given
  System system = new System();
  // When 
  system.startOperation();
  //Then
  system.awaitOperationFinished();
});

// Given
System system = new System();
// When 
system.startOperation();
//Then
Assertions.assertTimeoutPreemptively(Duration.ofSeconds(5), () -> system.awaitOperationFinished());

JUnit5 has the advantage that the given-when-then steps all happen in the same method. So assuming the given and when steps take a negligible amount of time, the steps can be moved into the body of the lambda.

However in Cucumber the given-when-then steps are in different methods and may take non-negligible amounts of time.

So in a really simplified fashion what we're looking for is more like this:

@Given
void given(){}
@When
void when(){}
@Then(timeout = 5, SECONDS)
void then(){}

void test(){
       ExecutorService executor = newSingleThreadExecutor();
       executor.submit(this::given).get();
       executor.submit(this::when).get();
       executor.submit(this::then).get(5L, SECONDS); // should read from annotation  

      // Depending on strategy await or abandon the executor and fail the test.
}

christianhujer · 2018-11-24T14:24:30Z

That will probably be sufficient for most users. Means, each Scenario to be run in a thread of its own. That makes a lot of sense, and it's actually also a nice preparation for running each Scenario in a separate Thread for speeding up tests in multicore machines. Or can Cucumber do that already?

I've tried to trace down the location in the code that needs to be changed. What I've found is TestCase.run(EventBus bus). This seems to me the right place for catching timeouts.

I assume that the afterHooks should probably run even in case a step failed its timeout.
@Before and @After probably need a slightly different handling somewhere else, and in a multithreaded runner would not share the same Thread as the Scenarios.

If done nicely, this would actually also have a positive effect on the stack traces - less Cucumber code on the stack trace of user exceptions.

I'll try something there, let's see where it takes me.

mpkorstanje · 2018-11-24T15:15:32Z

Or can Cucumber do that already?

Cucumber supports parallel execution since v4.

I've tried to trace down the location in the code that needs to be changed. What I've found is TestCase.run(EventBus bus). This seems to me the right place for catching timeouts.

I think the ExecutorService should be owned by the runner and then passed down to Invoker.invoke. It is a pretty deep call hierarchy but the information about the timeout is only available down there.

I assume that the afterHooks should probably run even in case a step failed its timeout.
@Before and @After probably need a slightly different handling somewhere else, and in a multithreaded runner would not share the same Thread as the Scenarios.

The current multi threaded implementation executes all steps and hooks on the same thread. So that is currently not a problem.

However not executing them, or changing the execution thread will be a problem. Hooks must be executed on the same thread as steps, for the same reason the thread must be the same for all steps.

Another annoying part is that hooks have invoke around semantics. Once the scenario or step they belong to is executed, they can not be skipped. But if we mark a step as failed due to a pre-emptive timeout, the thread that should execute the after hook is still blocked. Not to mention that trying to tear down the system under test may be another blocking operation.

mpkorstanje · 2018-11-24T15:27:25Z

Honestly, the more I think about it, the better it might be to use non-blocking IO in scenarios where you can't trust the input to contain a new line. I reckon it will be a smaller headace then trying to fix Cucumber. You'll end up fighting all the constraints that are already in place.

christianhujer · 2018-11-25T08:27:35Z

On using non-blocking I/O: that may fix exactly one (my) given class of situations. Although using JUnit.assertTimeoutPreemptively() was by far the much easier fix. But it cannot deal with the general case that test code might call something that misbehaves, in this case, times out, ignoring interrupts, and that test frameworks should be able to handle that. In the current situation, people are forced into redundancy: They put timeout on their annotations, and still have to put another timeout on a higher level, like their CI/CD pipeline.

For now, I think the least that should be done is a big stern warning on the timeout attribute of the step annotations that the timeout is implemented using interrupt() and thus cannot be relied on, and recommend that people use something else instead, like call Assertions.assertTimeoutPreemptively() from JUnit or something like that if they need reliable timeouts.

Regarding the ThreadLocal, I wonder if it's really such a common issue. As long as users would have the choice to chose the timeout strategy, it would be possible to offer JUnit-like hard timeouts on separate threads, with the drawback of not coping with ThreadLocal, or the existing behavior, coping with ThreadLocal but not having hard timeouts. I for my part have never in my life so far written or dealt with code that has used ThreadLocal in a way that it would become an issue. After all, ThreadLocals are sort of global variables, which is why I avoid them. Of course, that's a limited, non-representative sample set. And I'm not denying that there are projects for which ThreadLocal is a topic. Those projects shouldn't be broken by a Cucumber update. So, any potential new timeout strategy that runs steps on separate threads, thus breaking ThreadLocal, or in any other way is potentially incompatible with the current behavior, should be a non-default option.

The parallel execution that's already there is implemented in Runtime.run() where it passes a pickleEvent to runnerSupplier.get().runPickle() submitted to the executor, isn't it?

As a side note, for "outsiders" like me, "Pickle" is a very intransparent word. What business logic does it represent? I have no clue whether a "Pickle" is a Scenario or a Feature or something else. From the code, I guess that a Pickle is a Compiled Feature. But I'm just guessing. Software is hard to understand when the terms internally do not represent the business logic, and I can't see how the word Pickle represents any business logic of Cucumber. I think I should not have to debug or read the implementation of functions like compiler.compile() to understand what "Pickle" actually means. It may have been funny for the authors, but gives potential contributors a hard time.

mpkorstanje · 2018-11-25T11:49:10Z

I am not opposed to adding more documentation. A factual rather then stern tone is prefered. I reckon a note similar to the one provided with assertTimeout should suffice.

We can also avoid the problem of people being mislead by timeout by removing it all together. This moves the burden of dealing with unreliable code closer to the owner of that problematic code. I would actually prefer this solution as it removes a rather warty feature from Cucumber. This would be similar to what happened with transition from JUnit4 to JUnit5. @Test(timeout=5) has been replaced by assertTimeout and assertTimeoutPreemptively. Currently develop-v5 has been set to use only Java 8 so we could start deprecating it there.

While indeed a bit obscure, a pickle is neither a feature nor a scenario. As an example a feature with two scenarios and one scenario outline with two examples consists of four pickles. To the best of my knowledge there term to describe this naturally. You can find more about it in cucumber/gherkin.

Partial fix for #1506

mpkorstanje · 2019-01-26T14:04:20Z

I think that we should deprecate timeout in v5 and remove it in v6.

mpkorstanje · 2019-07-12T20:14:31Z

Library alternatives to recommend:

mpkorstanje · 2019-07-12T20:38:02Z

Closing this in favor of #1694.

@given

## Summary It is possible to provide a timeout to step definitions, however the semantics are unreliable ( #1506). ```java @given(value = "^I have (\\d+) cukes in my belly$", timeout = 5000) public void I_have_cukes_in_my_belly(int cukes) throws Throwable { } ``` Fixes: #1694 ## Current Behavior When the step starts a long running task Cucumber will attempt to interrupt the step once the timeout period is exceeded. If the long running task ignores the interrupt Cucumber will however not stop the test. Depending on the context this behavior is either desired or undesirable. See #1506 for in detail discussion. Additionally the current implementation is complex and has been prone to failures (#1244, #1241, #811, #639, #540). ## Possible Solution While it is possible to implement different strategies to deal with timeouts; there is no perfect solution. And regardless of which solution we pick we would take on a significant amount of complexity. So I believe that for Cucumber there is no good solution. However this problem has been solved by various libraries: * [JUnit 5 `Assertions.assertTimeout*`](https://junit.org/junit5/docs/5.0.1/api/org/junit/jupiter/api/Assertions.html#assertTimeout-java.time.Duration-org.junit.jupiter.api.function.Executable-) * [Awaitility](https://github.com/awaitility/awaitility) * [Guava `TimeLimiter`](https://github.com/google/guava/blob/master/guava/src/com/google/common/util/concurrent/TimeLimiter.java) So rather then keeping a poor feature alive we should recommend users to migrate to third party solutions. ## Context & Motivation Remove the eldritch horror that is `Invoker/Timeout.timeout`. While this could have been done in v4.x, v5 will use Java 8 and most of the above libraries require Java 8.

@given

It is possible to provide a timeout to step definitions, however the semantics are unreliable ( #1506). ```java @given(value = "^I have (\\d+) cukes in my belly$", timeout = 5000) public void I_have_cukes_in_my_belly(int cukes) throws Throwable { } ``` Fixes: #1694 When the step starts a long running task Cucumber will attempt to interrupt the step once the timeout period is exceeded. If the long running task ignores the interrupt Cucumber will however not stop the test. Depending on the context this behavior is either desired or undesirable. See #1506 for in detail discussion. Additionally the current implementation is complex and has been prone to failures (#1244, #1241, #811, #639, #540). While it is possible to implement different strategies to deal with timeouts; there is no perfect solution. And regardless of which solution we pick we would take on a significant amount of complexity. So I believe that for Cucumber there is no good solution. However this problem has been solved by various libraries: * [JUnit 5 `Assertions.assertTimeout*`](https://junit.org/junit5/docs/5.0.1/api/org/junit/jupiter/api/Assertions.html#assertTimeout-java.time.Duration-org.junit.jupiter.api.function.Executable-) * [Awaitility](https://github.com/awaitility/awaitility) * [Guava `TimeLimiter`](https://github.com/google/guava/blob/master/guava/src/com/google/common/util/concurrent/TimeLimiter.java) So rather then keeping a poor feature alive we should recommend users to migrate to third party solutions. Remove the eldritch horror that is `Invoker/Timeout.timeout`. While this could have been done in v4.x, v5 will use Java 8 and most of the above libraries require Java 8. (cherry picked from commit 0ee6620)

mpkorstanje added Bug 🙏 help wanted Help wanted - not prioritized by core team labels Nov 23, 2018

mpkorstanje added a commit that referenced this issue Nov 25, 2018

[Java] Improve documentation around timeout

ec3b9d2

Partial fix for #1506

mpkorstanje added this to the 5.0.0 milestone Jan 26, 2019

mpkorstanje mentioned this issue Jul 12, 2019

Deprecate timout in favour of library based solutions #1694

Closed

mpkorstanje mentioned this issue Jul 12, 2019

[Core] Remove timeout #1695

Closed

mpkorstanje closed this as completed Jul 16, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

timeout using Thread.interrupt() is unreliable #1506

timeout using Thread.interrupt() is unreliable #1506

christianhujer commented Nov 23, 2018 •

edited

Loading

mpkorstanje commented Nov 23, 2018

christianhujer commented Nov 24, 2018 •

edited

Loading

mpkorstanje commented Nov 24, 2018 •

edited

Loading

christianhujer commented Nov 24, 2018

mpkorstanje commented Nov 24, 2018 •

edited

Loading

mpkorstanje commented Nov 24, 2018 •

edited

Loading

christianhujer commented Nov 25, 2018

mpkorstanje commented Nov 25, 2018 •

edited

Loading

mpkorstanje commented Jan 26, 2019 •

edited

Loading

mpkorstanje commented Jul 12, 2019 •

edited

Loading

mpkorstanje commented Jul 12, 2019

timeout using Thread.interrupt() is unreliable #1506

timeout using Thread.interrupt() is unreliable #1506

Comments

christianhujer commented Nov 23, 2018 • edited Loading

Summary

Expected Behavior

Current Behavior

Possible Solution

Workarounds

Steps to Reproduce (for bugs)

Context & Motivation

Your Environment

mpkorstanje commented Nov 23, 2018

christianhujer commented Nov 24, 2018 • edited Loading

mpkorstanje commented Nov 24, 2018 • edited Loading

christianhujer commented Nov 24, 2018

mpkorstanje commented Nov 24, 2018 • edited Loading

mpkorstanje commented Nov 24, 2018 • edited Loading

christianhujer commented Nov 25, 2018

mpkorstanje commented Nov 25, 2018 • edited Loading

mpkorstanje commented Jan 26, 2019 • edited Loading

mpkorstanje commented Jul 12, 2019 • edited Loading

mpkorstanje commented Jul 12, 2019

christianhujer commented Nov 23, 2018 •

edited

Loading

christianhujer commented Nov 24, 2018 •

edited

Loading

mpkorstanje commented Nov 24, 2018 •

edited

Loading

mpkorstanje commented Nov 24, 2018 •

edited

Loading

mpkorstanje commented Nov 24, 2018 •

edited

Loading

mpkorstanje commented Nov 25, 2018 •

edited

Loading

mpkorstanje commented Jan 26, 2019 •

edited

Loading

mpkorstanje commented Jul 12, 2019 •

edited

Loading