Add AsyncBytes, AsyncLineSequence, AsyncCharacterSequence, and AsyncUnicodeScalarSequence #3036
Open: Catfish-Man wants to merge 10 commits into main from a-sink-eye-oh
4860800 Add AsyncBytes, AsyncLineSequence, AsyncCharacterSequence, and AsyncU… (Catfish-Man)
4fb48c4 Apply suggestions from code review (Catfish-Man)
557cf62 Merge branch 'main' into a-sink-eye-oh (millenomi)
f73d3dd throws async -> async throws (millenomi)
bd0c1e6 Add missing brace. (millenomi)
f0c813e Fix missing return and class name collision. (millenomi)
2b57290 Add missing availability. (millenomi)
a061294 Remove calls; these are passed as closures. (millenomi)
c812452 Include new unit tests in the test list. (millenomi)
f15da19 Experiment in avoiding the macOS error by using explicit availability. (millenomi)
@@ -0,0 +1,53 @@
//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2021 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
// See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
//
//===----------------------------------------------------------------------===//

@available(macOS 12.0, iOS 15.0, tvOS 15.0, watchOS 8.0, *)
public struct AsyncCharacterSequence<Base: AsyncSequence>: AsyncSequence where Base.Element == UInt8 {
    public typealias Element = Character

    var underlying: AsyncUnicodeScalarSequence<Base>

    @frozen
    public struct AsyncIterator: AsyncIteratorProtocol {
        @usableFromInline var remaining: AsyncUnicodeScalarSequence<Base>.AsyncIterator
        @usableFromInline var accumulator = ""

        @inlinable @inline(__always)
        public mutating func next() async rethrows -> Character? {
            while let scalar = try await remaining.next() {
                accumulator.unicodeScalars.append(scalar)
                if accumulator.count > 1 {
                    return accumulator.removeFirst()
                }
            }
            return accumulator.count > 0 ? accumulator.removeFirst() : nil
        }
    }

    public func makeAsyncIterator() -> AsyncIterator {
        return AsyncIterator(remaining: underlying.makeAsyncIterator())
    }

    internal init(underlyingSequence: Base) {
        underlying = AsyncUnicodeScalarSequence(underlyingSequence: underlyingSequence)
    }
}

@available(macOS 12.0, iOS 15.0, tvOS 15.0, watchOS 8.0, *)
public extension AsyncSequence where Self.Element == UInt8 {
    /**
     A non-blocking sequence of `Characters` created by decoding the elements of `self` as UTF8.
     */
    var characters: AsyncCharacterSequence<Self> {
        AsyncCharacterSequence(underlyingSequence: self)
    }
}
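For anyone following the accumulator logic in `AsyncCharacterSequence.AsyncIterator.next()`: a `Character` can span several Unicode scalars, so the iterator only knows a character is complete once a second one has started. The sketch below is a synchronous illustration of that idea, not code from this PR; the function name `charactersByAccumulating` is hypothetical and it takes an array of scalars rather than an async byte source.

```swift
// Hypothetical, synchronous illustration of the accumulator technique;
// the PR itself does this inside an async iterator.
func charactersByAccumulating(_ scalars: [Unicode.Scalar]) -> [Character] {
    var accumulator = ""
    var result: [Character] = []
    for scalar in scalars {
        accumulator.unicodeScalars.append(scalar)
        // Once the string holds two Characters, the first one can no longer
        // grow, so it is safe to hand it out.
        if accumulator.count > 1 {
            result.append(accumulator.removeFirst())
        }
    }
    // Whatever is left at the end of input is the final Character.
    if !accumulator.isEmpty {
        result.append(accumulator.removeFirst())
    }
    return result
}

// "e" followed by U+0301 (combining acute accent), then "a".
let sampleScalars = Array("e\u{301}a".unicodeScalars)
let sampleCharacters = charactersByAccumulating(sampleScalars)
```

Here three scalars come in and two characters come out, because the combining accent attaches to the preceding "e" before the "a" closes the cluster.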
@@ -0,0 +1,166 @@
//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2021 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
// See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
//
//===----------------------------------------------------------------------===//

@available(macOS 12.0, iOS 15.0, tvOS 15.0, watchOS 8.0, *)
public struct AsyncLineSequence<Base: AsyncSequence>: AsyncSequence where Base.Element == UInt8 {
    public typealias Element = String

    var base: Base

    @available(macOS 12.0, iOS 15.0, tvOS 15.0, watchOS 8.0, *)
    public struct AsyncIterator: AsyncIteratorProtocol {
        public typealias Element = String

        var byteSource: Base.AsyncIterator
        var buffer: Array<UInt8> = []
        var leftover: UInt8? = nil

        internal init(underlyingIterator: Base.AsyncIterator) {
            byteSource = underlyingIterator
        }

        // We'd like to reserve flexibility to improve the implementation of
        // next() in the future, so aren't marking it @inlinable. Manually
        // specializing for the common source types helps us get back some of
        // the performance we're leaving on the table.
        @_specialize(where Base == URL.AsyncBytes)
        @_specialize(where Base == FileHandle.AsyncBytes)
        //@_specialize(where Base == URLSession.AsyncBytes)
        public mutating func next() async rethrows -> String? {
            /*
             0D 0A: CR-LF
             0A | 0B | 0C | 0D: LF, VT, FF, CR
             C2 85: U+0085 (NEXT LINE)
             E2 80 A8: U+2028 (LINE SEPARATOR)
             E2 80 A9: U+2029 (PARAGRAPH SEPARATOR)
             */
            let _CR: UInt8 = 0x0D
            let _LF: UInt8 = 0x0A
            let _NEL_PREFIX: UInt8 = 0xC2
            let _NEL_SUFFIX: UInt8 = 0x85
            let _SEPARATOR_PREFIX: UInt8 = 0xE2
            let _SEPARATOR_CONTINUATION: UInt8 = 0x80
            let _SEPARATOR_SUFFIX_LINE: UInt8 = 0xA8
            let _SEPARATOR_SUFFIX_PARAGRAPH: UInt8 = 0xA9

            func yield() -> String? {
                defer {
                    buffer.removeAll(keepingCapacity: true)
                }
                if buffer.isEmpty {
                    return nil
                }
                return String(decoding: buffer, as: UTF8.self)
            }

            func nextByte() async throws -> UInt8? {
                defer { leftover = nil }
                if let leftover = leftover {
                    return leftover
                }
                return try await byteSource.next()
            }

            while let first = try await nextByte() {
                switch first {
                case _CR:
                    let result = yield()
                    // Swallow up any subsequent LF
                    guard let next = try await byteSource.next() else {
                        return result //if we ran out of bytes, the last byte was a CR
                    }
                    if next != _LF {
                        leftover = next
                    }
                    if let result = result {
                        return result
                    }
                    continue
                case _LF..<_CR:
                    guard let result = yield() else {
                        continue
                    }
                    return result
                case _NEL_PREFIX: // this may be used to compose other UTF8 characters
                    guard let next = try await byteSource.next() else {
                        // technically invalid UTF8 but it should be repaired to "\u{FFFD}"
                        buffer.append(first)
                        return yield()
                    }
                    if next != _NEL_SUFFIX {
                        buffer.append(first)
                        buffer.append(next)
                    } else {
                        guard let result = yield() else {
                            continue
                        }
                        return result
                    }
                case _SEPARATOR_PREFIX:
                    // Try to read: 80 [A8 | A9].
                    // If we can't, then we put the byte in the buffer for error correction
                    guard let next = try await byteSource.next() else {
                        buffer.append(first)
                        return yield()
                    }
                    guard next == _SEPARATOR_CONTINUATION else {
                        buffer.append(first)
                        buffer.append(next)
                        continue
                    }
                    guard let fin = try await byteSource.next() else {
                        buffer.append(first)
                        buffer.append(next)
                        return yield()
                    }
                    guard fin == _SEPARATOR_SUFFIX_LINE || fin == _SEPARATOR_SUFFIX_PARAGRAPH else {
                        buffer.append(first)
                        buffer.append(next)
                        buffer.append(fin)
                        continue
                    }
                    if let result = yield() {
                        return result
                    }
                    continue
                default:
                    buffer.append(first)
                }
            }
            // Don't emit an empty newline when there is no more content (e.g. end of file)
            if !buffer.isEmpty {
                return yield()
            }
            return nil
        }

    }

    public func makeAsyncIterator() -> AsyncIterator {
        return AsyncIterator(underlyingIterator: base.makeAsyncIterator())
    }

    internal init(underlyingSequence: Base) {
        base = underlyingSequence
    }
}

@available(macOS 12.0, iOS 15.0, tvOS 15.0, watchOS 8.0, *)
public extension AsyncSequence where Self.Element == UInt8 {
    /**
     A non-blocking sequence of newline-separated `Strings` created by decoding the elements of `self` as UTF8.
     */
    @available(macOS 12.0, iOS 15.0, tvOS 15.0, watchOS 8.0, *)
    var lines: AsyncLineSequence<Self> {
        AsyncLineSequence(underlyingSequence: self)
    }
}
Forgive me if I'm oversimplifying... there's a lot going on here. 😊 It looks like this completely removes empty lines, which seems suboptimal, given that...
`isEmpty` filter on the sequence if you really do want to strip them out.

Based on this, the default behavior should probably be to include empty lines (i.e., change `continue` to `return ""` in the `_CR` and `_LF` cases).

If in doubt, please cross-check with Foundation's behavior. It needs to have the same semantics, else we widen the (already large) gap between APPLE and !APPLE.
Ah, yeah, good point. This does match Foundation's behavior, so the issue would be against Apple, not this code. Thanks for the check!
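
To make the two behaviors discussed in this thread concrete, here is a minimal synchronous sketch. It is not the PR's implementation: `splitLines` and its `keepEmptyLines` flag are hypothetical names, it works on a plain byte array instead of an `AsyncSequence`, and it only handles CR, LF, and CR-LF. With the flag off it mirrors the `continue` behavior in the diff (which, per the comment above, matches Foundation); with the flag on it mirrors the suggested `return ""` alternative.

```swift
// Hypothetical helper for illustration only; not part of the PR.
func splitLines(_ bytes: [UInt8], keepEmptyLines: Bool) -> [String] {
    let cr: UInt8 = 0x0D
    let lf: UInt8 = 0x0A
    var lines: [String] = []
    var buffer: [UInt8] = []
    var index = 0

    // Emit the buffered bytes as one line, honoring the empty-line policy.
    func flush() {
        if !buffer.isEmpty || keepEmptyLines {
            lines.append(String(decoding: buffer, as: UTF8.self))
        }
        buffer.removeAll(keepingCapacity: true)
    }

    while index < bytes.count {
        let byte = bytes[index]
        index += 1
        switch byte {
        case cr:
            flush()
            // Swallow the LF of a CR-LF pair.
            if index < bytes.count, bytes[index] == lf {
                index += 1
            }
        case lf:
            flush()
        default:
            buffer.append(byte)
        }
    }
    // Trailing content without a final newline is still a line.
    if !buffer.isEmpty {
        lines.append(String(decoding: buffer, as: UTF8.self))
    }
    return lines
}

let sampleInput = Array("a\n\nb\r\nc".utf8)
let skippingEmpty = splitLines(sampleInput, keepEmptyLines: false)
let keepingEmpty = splitLines(sampleInput, keepEmptyLines: true)
```

For the input `a\n\nb\r\nc`, the skipping variant yields three lines and the keeping variant yields four, with an empty string in second position. A caller of the keeping variant who does not want blank lines could still drop them afterwards with `keepingEmpty.filter { !$0.isEmpty }`, which is the trade-off the first comment points at.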