Spurious ReactiveStreamsTest#testConnectionDoesNotGetClosed failure #1380
Comments
I checked the most recent build from Travis. I found this buried in the logs.
I'm not sure, but it looks like there's more than one issue going on here... |
I'm pretty sure I got the But sadly, it's completely unrelated to the |
I can get
|
I've been thinking about this failure for a few days now and still can't figure out what would be causing this. What's the exact command you're running? |
I just run main method from IntelliJ. |
Ah, there's a main method in the unit test. I was just running the test from |
Okay, I was able to reproduce the failure. I don't have a solution here yet but I think I've learned some interesting things:
Position 12000 is exactly at 12 kilobytes. Is that a magic number of a buffer somewhere important? |
Oh. Also. When the entire content fails it fails the same way. The MD5 of the bad response is the same each time. |
I've seen it happen at offset 24,000 too. 12000 bytes is not 12kb :) The problem is not server side, as I was able to reproduce it with Jetty, Netty and Tomcat. 12000 has to be some buffer size, either in Netty or in the kernel, but I wasn't able to spot it. It happens on OSX (my machine) and Linux (Travis). No idea what happens. Looks like buffer corruption. |
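For anyone trying to pin down the corruption, a minimal sketch of the two checks discussed above: the MD5 of a response body, and the offset of the first byte that differs from the expected body. The class and method names (`ResponseDiff`, `md5Hex`, `firstMismatch`) are hypothetical and not part of AHC or its test suite, and the swapped 12,000-byte chunks in `main` are simulated data chosen only to mirror the offsets reported here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of AHC: compares an expected body with a received one.
class ResponseDiff {

    // Hex-encoded MD5 of a body, useful to check whether two bad responses are identical.
    static String md5Hex(byte[] body) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(body);
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Index of the first differing byte, or -1 if both arrays are identical.
    static int firstMismatch(byte[] expected, byte[] actual) {
        int common = Math.min(expected.length, actual.length);
        for (int i = 0; i < common; i++) {
            if (expected[i] != actual[i]) {
                return i;
            }
        }
        return expected.length == actual.length ? -1 : common;
    }

    public static void main(String[] args) throws Exception {
        // Simulated payload; the real test streams a much larger body.
        byte[] expected = new byte[36_000];
        for (int i = 0; i < expected.length; i++) {
            expected[i] = (byte) (i % 251);
        }

        // Assumption for illustration: two 12,000-byte chunks arrive in the wrong order.
        byte[] actual = expected.clone();
        System.arraycopy(expected, 24_000, actual, 12_000, 12_000);
        System.arraycopy(expected, 12_000, actual, 24_000, 12_000);

        System.out.println("first mismatch at offset " + firstMismatch(expected, actual));
        System.out.println("expected md5 " + md5Hex(expected));
        System.out.println("actual   md5 " + md5Hex(actual));
    }
}
```

If the first mismatch always lands on a multiple of the same chunk size, that points at chunks being reordered rather than at random buffer corruption.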
I'm inclined to say the same. The fact that it's the same failure every time on my machine is just weird though. In any event I'm skeptical that the bug is in AHC. Might be worthwhile to try playing with the underlying Netty version to see if the problem disappears. Have you done that yet? |
I can reproduce with Netty 4.1.0.Final. I didn't bother digging further.
This feature involves many libs:
Hard to figure out where the issue might happen. |
Can we backport this test to an earlier version of AHC? My thought on this is that if AHC 2.0 exhibits the same bug then this isn't a regression and we should feel free to move forward releasing AHC 2.1. |
It fails too on AHC 2.0
I currently have no idea if this bug is only limited to reactivestreams based features or if it's a critical issue that just happened to be spotted here. I don't feel comfortable releasing the next stable version with such doubt lurking around. |
I don't know the codebase well enough to answer this question for sure, but the surface area of AHC itself on the critical path for this feature seems relatively small to me. The fact that it fails, too, on AHC 2.0 means that in your shoes I'd probably release the code with a Known Bug caveat in the release notes. That is just me, however. I don't have time this weekend, but perhaps next weekend I'll have time to dig in and see if I can play with dependencies and determine whether or not changing netty versions and such causes the problem to disappear. |
@farmdawgnation Please give it a try. I'm pretty sure it was an RxJava 1 bug that was fixed in RxJava 2. There is other foobar stuff related to reactive streams that I want to clean up before releasing 2.1. I'll work on this when I'm back from vacation. |
Huzzah! It seems to have resolved the issue on my end, too! |
Great, thanks for the feedback! |
Issue still happens almost every time on Travis. I think Subscription is sometimes being requested from 2 different threads: from the calling thread and from the channel's event loop. When such a switch happens, I think chunks can be written out of order, as the ones read from the calling thread have to jump into the event loop. |
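A minimal sketch of the suspected ordering problem, using only `java.util.concurrent`. This is not AHC or Netty code: a single-thread executor stands in for the channel's event loop, the class name `OutOfOrderWrites` is hypothetical, and the latches force the interleaving that would only happen occasionally in the real client.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: a single-thread executor plays the role of the event loop.
class OutOfOrderWrites {

    static List<Integer> run() throws Exception {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        List<Integer> wire = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch loopBusy = new CountDownLatch(1);
        CountDownLatch chunk0Queued = new CountDownLatch(1);

        // The event loop is already running a task that will write chunk 1 inline.
        Callable<Void> inlineWrite = () -> {
            loopBusy.countDown();
            chunk0Queued.await(); // forces the racy interleaving for the demo
            wire.add(1);          // written directly, since we are on the loop thread
            return null;
        };
        Future<Void> loopTask = eventLoop.submit(inlineWrite);

        // The calling thread read chunk 0 first, but has to hop onto the event loop
        // to write it, so its write is queued behind the task that is already running.
        loopBusy.await();
        Callable<Boolean> queuedWrite = () -> wire.add(0);
        Future<Boolean> hopped = eventLoop.submit(queuedWrite);
        chunk0Queued.countDown();

        loopTask.get();
        hopped.get();
        eventLoop.shutdown();
        return new ArrayList<>(wire);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("order on the wire: " + run());
    }
}
```

Chunk 0 was read first but ends up on the wire after chunk 1, which is the kind of reordering that would explain a body that is the right length but has the wrong content.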
Reproducer:
org.asynchttpclient.reactivestreams.ReactiveStreamsTest#main
(might have to run several times):