Skip to content

isOutboundHalfClosureEnabled closes channel before all writes are flushed #3139

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
adam-fowler opened this issue Mar 11, 2025 · 14 comments · May be fixed by #3148
Open

isOutboundHalfClosureEnabled closes channel before all writes are flushed #3139

adam-fowler opened this issue Mar 11, 2025 · 14 comments · May be fixed by #3148
Labels
kind/bug Feature doesn't work as expected. size/M Medium task. (A couple of days of work.)

Comments

@adam-fowler
Copy link
Contributor

adam-fowler commented Mar 11, 2025

Expected behavior

If I create a NIOAsyncChannel with isOutboundHalfClosureEnabled set to true. When I call outbound.finish() I expect that all writes will have been flushed before the channel is closed.

This was meant to be a workaround for the fact that NIOAsyncChannel.executeThenClose didn't flush writes, but it has the same issue.

Steps to reproduce

  1. Write small HTTP server that outputs a large response
  2. Use curl to get response.
  3. curl will report with something like curl: (18) transfer closed with 672832 bytes remaining to read

If possible, minimal yet complete reproducer code (or URL to code)

Here is a small HTTP server that returns 1000000 bytes in its response

import NIOCore
import NIOHTTP1
import NIOPosix

/// Sendable server response that doesn't use ``IOData``
public typealias SendableHTTPServerResponsePart = HTTPPart<HTTPResponseHead, ByteBuffer>

/// Channel to convert HTTPServerResponsePart to the Sendable type HBHTTPServerResponsePart
final class HTTPSendableResponseChannelHandler: ChannelOutboundHandler, RemovableChannelHandler {
    typealias OutboundIn = SendableHTTPServerResponsePart
    typealias OutboundOut = HTTPServerResponsePart

    func write(context: ChannelHandlerContext, data: NIOAny, promise: EventLoopPromise<Void>?) {
        let part = unwrapOutboundIn(data)
        switch part {
        case .head(let head):
            context.write(self.wrapOutboundOut(.head(head)), promise: promise)
        case .body(let buffer):
            context.write(self.wrapOutboundOut(.body(.byteBuffer(buffer))), promise: promise)
        case .end:
            context.writeAndFlush(self.wrapOutboundOut(.end(nil)), promise: promise)
        }
    }
}

@available(macOS 14, *)
func server() async throws {
    let asyncChannel = try await ServerBootstrap(group: MultiThreadedEventLoopGroup.singleton)
        // Specify backlog and enable SO_REUSEADDR for the server itself
        .serverChannelOption(ChannelOptions.socketOption(.so_reuseaddr), value: 1)
        .childChannelOption(ChannelOptions.allowRemoteHalfClosure, value: true)
        .bind(host: "127.0.0.1", port: 8888, serverBackPressureStrategy: nil) { channel in
            return channel.eventLoop.makeCompletedFuture {
                try channel.pipeline.syncOperations.configureHTTPServerPipeline()
                try channel.pipeline.syncOperations.addHandler(HTTPSendableResponseChannelHandler())
                return try NIOAsyncChannel<HTTPServerRequestPart, SendableHTTPServerResponsePart>(
                    wrappingChannelSynchronously: channel,
                    configuration: .init(isOutboundHalfClosureEnabled: true)
                )
            }
        }
    print("Listening on 127.0.0.1:8888")
    await withDiscardingTaskGroup { group in
        do {
            try await asyncChannel.executeThenClose { inbound in
                for try await childChannel in inbound {
                    group.addTask {
                        try? await childChannel.executeThenClose { inbound, outbound in
                            for try await part in inbound {
                                if case .end = part {
                                    let buffer = ByteBuffer(repeating: 0, count: 1_000_000)
                                    try await outbound.write(
                                        .head(
                                            .init(
                                                version: .http1_1,
                                                status: .ok,
                                                headers: ["Content-Length": "\(buffer.readableBytes)"]
                                            )
                                        )
                                    )
                                    try await outbound.write(.body(buffer))
                                    try await outbound.write(.end(nil))
                                    outbound.finish()
                                    try await childChannel.channel.closeFuture.get()
                                    break
                                }
                            }
                        }
                        print("Closed")
                    }
                }
            }
        } catch {
            print("ERROR: Waiting on child channel: \(error)")
        }
    }

}

if #available(macOS 14, *) {
    try await server()
}

SwiftNIO version/commit hash

be266a9 (latest of this morning)

System & version information

swift-driver version: 1.115.1 Apple Swift version 6.0.3 (swiftlang-6.0.3.1.10 clang-1600.0.30.1)
Target: arm64-apple-macosx15.0
Darwin Adams-MBP-M1-Max.local 24.3.0 Darwin Kernel Version 24.3.0: Thu Jan 2 20:24:16 PST 2025; root:xnu-11215.81.4~3/RELEASE_ARM64_T6000 arm64

@Lukasa
Copy link
Contributor

Lukasa commented Mar 11, 2025

Thanks for filing this Adam, I'll take a look today and see if I can figure out what's happening here, as this should work.

@Lukasa
Copy link
Contributor

Lukasa commented Mar 11, 2025

Ok, this looks like a fundamental bug. We currently process outbound close eagerly:

final override func close0(error: Error, mode: CloseMode, promise: EventLoopPromise<Void>?) {
do {
switch mode {
case .output:
if self.outputShutdown {
promise?.fail(ChannelError._outputClosed)
return
}
if self.inputShutdown {
// Escalate to full closure
self.close0(error: error, mode: .all, promise: promise)
return
}
try self.shutdownSocket(mode: mode)
// Fail all pending writes and so ensure all pending promises are notified
self.pendingWrites.failAll(error: error, close: false)
self.unregisterForWritable()
promise?.succeed(())
self.pipeline.fireUserInboundEventTriggered(ChannelEvent.outputClosed)

I think this behaviour is just buggy. We should queue up outbound closure behind the writes.

@Lukasa Lukasa added kind/bug Feature doesn't work as expected. size/M Medium task. (A couple of days of work.) labels Mar 11, 2025
@adam-fowler
Copy link
Contributor Author

I didn't expect the issue to be so low level. That's a significant change I imagine

@Lukasa
Copy link
Contributor

Lukasa commented Mar 11, 2025

Certainly a non-trivial one. My instinct is that the fix belongs in the pending writes manager.

This is a great find though, because this bug affects all kinds of programs, not just the ones that use NIOAsyncChannel. I really appreciate your hard work root-causing it @adam-fowler.

@JoachimM7
Copy link
Contributor

@Lukasa Thank you for taking care of this issue!

Adam already knew the bug, when I asked him on Discord. I ran into this issue, because I tried to make a get request with URLSession and some biger data served from Hummingbird 2. I then always got the error Error Domain=NSURLErrorDomain Code=-1005 "The network connection was lost.". Interestingly there is no issue when using HTTP1, only with HTTP2. And more interesting: If I did the same request with the application Postman, there was no issue!

If I use curl, it seems that I get all data, but there is then the error: curl: (92) HTTP/2 stream 1 was not closed cleanly: CANCEL (err 8)

Am I maybe running into a different issue?

@adam-fowler
Copy link
Contributor Author

@Lukasa Thank you for taking care of this issue!

Adam already knew the bug, when I asked him on Discord. I ran into this issue, because I tried to make a get request with URLSession and some biger data served from Hummingbird 2. I then always got the error Error Domain=NSURLErrorDomain Code=-1005 "The network connection was lost.". Interestingly there is no issue when using HTTP1, only with HTTP2. And more interesting: If I did the same request with the application Postman, there was no issue!

If I use curl, it seems that I get all data, but there is then the error: curl: (92) HTTP/2 stream 1 was not closed cleanly: CANCEL (err 8)

Am I maybe running into a different issue?

@JoachimM7 you are running into the issue that the workaround should fix.

@adam-fowler
Copy link
Contributor Author

@Lukasa The comment from @JoachimM7 about seeing this in HTTP2 got me wondering if you have wrongly diagnosed this. I've implemented the same code as described in the issue on an HTTP2 stream and get the same error message as @JoachimM7.

As I understand it you are saying above the issue is related to writes not getting flushed when a connection is closed in BaseStreamSocketChannel.swift. But closing an HTTP2 stream doesn't close the connection so any writes should get flushed and even if the connection was closed it would only happen after the GOAWAY dance which implies all writes were flushed as the GOAWAY messages got written.

@JoachimM7
Copy link
Contributor

I may add one more thing to this: When using URLSession with a delegate (for checking the progress), it seems that the connection was lost after receiving the first bytes. So I get the error HTTP load failed, 291/4196674 bytes (error code: -1005 [1:57]). The func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data) in the delegate isn't even called once.

But when using curl, I do get all the data but just combined with the error curl: (92) HTTP/2 stream 1 was not closed cleanly: CANCEL (err 8)

@Lukasa
Copy link
Contributor

Lukasa commented Mar 12, 2025

I suspect there is more than one bug: it wouldn't surprise me at all if the same bug exists in the HTTP/2 stream channel.

@JoachimM7
Copy link
Contributor

@Lukasa : I don't get any issues when using HTTP1.1

I tried AsyncHTTPClient to be sure, that there is no problem with URLSession. I get the same error when I do try await response.body.collect(....) => StreamClosed(streamID: HTTP2StreamID(1), errorCode: HTTP2ErrorCode<0x8 Cancel>, location: "NIOHTTP2/HTTP2StreamChannel.swift:895")

@Lukasa
Copy link
Contributor

Lukasa commented Mar 12, 2025

That's totally fair, but with the example code above I do see an issue with HTTP/1.1, so I think it is in both places.

@JoachimM7
Copy link
Contributor

I used some breakpoints and I came to this line:

delegate.didTerminate(error: error)

It is called from here:

return .callDidTerminate(delegate)

because suspendedYields is empty.

@JoachimM7
Copy link
Contributor

I am sorry, you can forget my part. I think I found a bug in Hummingbird (hummingbird-project/hummingbird#688), so my issue isn't related here. In addition I found out, that I was wrong with my assumptions: Postman only worked because although I used https, it still went on HTTP/1.1

@adam-fowler
Copy link
Contributor Author

adam-fowler commented Mar 13, 2025

I have a fix for the HTTP2 side of things. It shouldn't really be needed as calling write should guarantee a buffer is written. The fix relies on the fact that HTTP2Channel closes a stream once it receives data with isEndStreamSet set to true. I added the following before exiting NIOAsyncChannel.executeThenClose.

// iterator is the inbound async iterator
while try await iterator.next() != nil {}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Feature doesn't work as expected. size/M Medium task. (A couple of days of work.)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants