Add partitioned(by:) #152

Merged
merged 20 commits into from
Oct 20, 2021
Changes from 12 commits
25 changes: 24 additions & 1 deletion Guides/Partition.md
Original file line number Diff line number Diff line change
@@ -42,6 +42,20 @@ let p = numbers.partitioningIndex(where: { $0.isMultiple(of: 20) })
// numbers[p...] = [20, 40, 60]
```

The standard library’s existing `filter(_:)` method returns only the elements
that match a given predicate. `partitioned(_:)` returns both the elements that
don’t match the predicate and those that do, as a tuple of two arrays in that
order.

```swift
let cast = ["Vivien", "Marlon", "Kim", "Karl"]
let (longNames, shortNames) = cast.partitioned({ $0.count < 5 })
print(longNames)
// Prints "["Vivien", "Marlon"]"
print(shortNames)
// Prints "["Kim", "Karl"]"
```
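
For comparison, getting both halves with `filter(_:)` alone takes two passes over the input and evaluates the predicate twice per element. A minimal sketch using only the standard library (the `isShort` name is introduced here for illustration and is not part of this change):

```swift
let cast = ["Vivien", "Marlon", "Kim", "Karl"]

// Illustrative predicate, named so it can be reused for both passes.
let isShort: (String) -> Bool = { $0.count < 5 }

// Two separate passes: one for the matches, one for everything else.
let shortNames = cast.filter(isShort)
let longNames = cast.filter { !isShort($0) }

print(longNames)
// Prints "["Vivien", "Marlon"]"
print(shortNames)
// Prints "["Kim", "Karl"]"
```

The arrays match the `partitioned(_:)` example above, which produces them in a single pass.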

## Detailed Design

All mutating methods are declared as extensions to `MutableCollection`.
@@ -69,11 +83,17 @@ extension Collection {
where belongsInSecondPartition: (Element) throws -> Bool
) rethrows -> Index
}

extension Sequence {
public func partitioned(
_ belongsInSecondCollection: (Element) throws -> Bool
) rethrows -> ([Element], [Element])
}
```

### Complexity

The existing partition is an O(_n_) operations, where _n_ is the length of the
The existing partition is an O(_n_) operation, where _n_ is the length of the
range to be partitioned, while the stable partition is O(_n_ log _n_). Both
partitions have algorithms with improved performance for bidirectional
collections, so it would be ideal for those to be customization points were they
@@ -82,6 +102,9 @@ to eventually land in the standard library.
`partitioningIndex(where:)` is a slight generalization of a binary search, and
is an O(log _n_) operation for random-access collections; O(_n_) otherwise.

`partitioned(_:)` is an O(_n_) operation, where _n_ is the number of elements in
the original sequence.
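
One way to see the single pass is to count predicate calls. The sketch below uses `partitionedSketch`, a hypothetical stand-in that mirrors the simple `Sequence` loop from this change so that the snippet runs without the package; it is not the package's API, and the variable names are illustrative.

```swift
// Hypothetical stand-in mirroring the `Sequence` overload; not the package API.
func partitionedSketch<S: Sequence>(
  _ source: S,
  _ belongsInSecond: (S.Element) throws -> Bool
) rethrows -> ([S.Element], [S.Element]) {
  var first: [S.Element] = []
  var second: [S.Element] = []
  for element in source {
    if try belongsInSecond(element) {
      second.append(element)
    } else {
      first.append(element)
    }
  }
  return (first, second)
}

let numbers = [10, 25, 30, 45]

// Single pass: the predicate runs once per element.
var singlePassCalls = 0
let (others, multiplesOfTen) = partitionedSketch(numbers) { number -> Bool in
  singlePassCalls += 1
  return number.isMultiple(of: 10)
}

// Two `filter(_:)` passes: the predicate runs twice per element.
var twoPassCalls = 0
let matching = numbers.filter { number -> Bool in
  twoPassCalls += 1
  return number.isMultiple(of: 10)
}
let nonMatching = numbers.filter { number -> Bool in
  twoPassCalls += 1
  return !number.isMultiple(of: 10)
}

print(others, multiplesOfTen)         // [25, 45] [10, 30]
print(singlePassCalls, twoPassCalls)  // 4 8
```

Both approaches are O(_n_); the difference is a constant factor in predicate evaluations and passes over the input.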

### Comparison with other languages

**C++:** The `<algorithm>` library defines `partition`, `stable_partition`, and
1 change: 1 addition & 0 deletions README.md
@@ -27,6 +27,7 @@ Read more about the package, and the intent behind it, in the [announcement on s
#### Subsetting operations

- [`compacted()`](https://github.com/apple/swift-algorithms/blob/main/Guides/Compacted.md): Drops the `nil`s from a sequence or collection, unwrapping the remaining elements.
- [`partitioned(_:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Partition.md): Returns the elements in a sequence or collection that do and do not match a given predicate.
- [`randomSample(count:)`, `randomSample(count:using:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/RandomSampling.md): Randomly selects a specific number of elements from a collection.
- [`randomStableSample(count:)`, `randomStableSample(count:using:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/RandomSampling.md): Randomly selects a specific number of elements from a collection, preserving their original relative order.
- [`striding(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Stride.md): Returns every nth element of a collection.
143 changes: 142 additions & 1 deletion Sources/Algorithms/Partition.swift
@@ -2,7 +2,7 @@
//
// This source file is part of the Swift Algorithms open source project
//
// Copyright (c) 2020 Apple Inc. and the Swift project authors
// Copyright (c) 2021 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
@@ -204,3 +204,144 @@ extension Collection {
}
}

//===----------------------------------------------------------------------===//
// partitioned(_:)
//===----------------------------------------------------------------------===//

extension Sequence {
/// Returns two arrays containing, in order, the elements of the sequence that
/// don’t and do satisfy the given predicate, respectively.
///
/// In this example, `partitioned(_:)` is used to separate the input based on
/// names that aren’t and are shorter than five characters, respectively:
///
/// let cast = ["Vivien", "Marlon", "Kim", "Karl"]
/// let (longNames, shortNames) = cast.partitioned({ $0.count < 5 })
/// print(longNames)
/// // Prints "["Vivien", "Marlon"]"
/// print(shortNames)
/// // Prints "["Kim", "Karl"]"
///
/// - Parameter belongsInSecondCollection: A closure that takes an element of
/// the sequence as its argument and returns a Boolean value indicating
/// whether the element should be included in the second returned array.
/// Otherwise, the element will appear in the first returned array.
///
/// - Returns: Two arrays with all of the elements of the receiver. The
/// first array contains all the elements that `belongsInSecondCollection`
/// didn’t allow, and the second array contains all the elements that
/// `belongsInSecondCollection` allowed.
///
/// - Complexity: O(*n*), where *n* is the length of the sequence.
///
/// - Note: This algorithm performs a bit slower than the `Collection`
/// overload, since the size of a sequence isn’t known ahead of time and the
/// returned arrays may need to be resized as they grow.
@inlinable
public func partitioned(
_ belongsInSecondCollection: (Element) throws -> Bool
) rethrows -> ([Element], [Element]) {
var lhs = [Element]()
var rhs = [Element]()

for element in self {
if try belongsInSecondCollection(element) {
rhs.append(element)
} else {
lhs.append(element)
}
}

return (lhs, rhs)
}
}

extension Collection {
/// Returns two arrays containing, in order, the elements of the collection
/// that don’t and do satisfy the given predicate, respectively.
///
/// In this example, `partitioned(_:)` is used to separate the input based on
/// names that aren’t and are shorter than five characters, respectively:
///
/// let cast = ["Vivien", "Marlon", "Kim", "Karl"]
/// let (longNames, shortNames) = cast.partitioned({ $0.count < 5 })
/// print(longNames)
/// // Prints "["Vivien", "Marlon"]"
/// print(shortNames)
/// // Prints "["Kim", "Karl"]"
///
/// - Parameter belongsInSecondCollection: A closure that takes an element of
/// the collection as its argument and returns a Boolean value indicating
/// whether the element should be included in the second returned array.
/// Otherwise, the element will appear in the first returned array.
///
/// - Returns: Two arrays with all of the elements of the receiver. The
/// first array contains all the elements that `belongsInSecondCollection`
/// didn’t allow, and the second array contains all the elements that
/// `belongsInSecondCollection` allowed.
///
/// - Complexity: O(*n*), where *n* is the length of the collection.
///
/// - Note: This function on `Collection` performs a bit faster than the
/// `Sequence` implementation. Since the size of the collection, and therefore
/// the total size of the two returned arrays, is known ahead of time, it can
/// avoid array resizing.
@inlinable
public func partitioned(
_ belongsInSecondCollection: (Element) throws -> Bool
) rethrows -> ([Element], [Element]) {
guard !self.isEmpty else {
return ([], [])
}

// Since collections have a known size (access to `count` is constant time,
// O(1), for `RandomAccessCollection`s and O(n) otherwise), we can allocate
// one array of size `self.count`, then insert items at the beginning or end
// of that contiguous block. This way, we don’t have to do any dynamic array
// resizing. Since we insert the right elements on the right side in reverse
// order, we need to reverse them back to the original order at the end.

let count = self.count

// Inside of the `initializer` closure, we set what the actual mid-point is.
// We will use this to partition the single array into two in constant time.
var midPoint: Int = 0

let elements = try [Element](
unsafeUninitializedCapacity: count,
initializingWith: { buffer, initializedCount in
var lhs = buffer.baseAddress!
var rhs = lhs + buffer.count
do {
for element in self {
if try belongsInSecondCollection(element) {
rhs -= 1
rhs.initialize(to: element)
} else {
lhs.initialize(to: element)
lhs += 1
}
}

let rhsIndex = rhs - buffer.baseAddress!
buffer[rhsIndex...].reverse()
initializedCount = buffer.count

midPoint = rhsIndex
} catch {
let lhsCount = lhs - buffer.baseAddress!
let rhsCount = (buffer.baseAddress! + buffer.count) - rhs
buffer.baseAddress!.deinitialize(count: lhsCount)
rhs.deinitialize(count: rhsCount)
throw error
}
})

let lhs = elements[..<midPoint]
let rhs = elements[midPoint...]
return (
Array(lhs),
Array(rhs)
)
}
}
36 changes: 36 additions & 0 deletions Tests/SwiftAlgorithmsTests/PartitionTests.swift
@@ -133,4 +133,40 @@ final class PartitionTests: XCTestCase {
}
}
}

func testPartitionedWithEmptyInput() {
let input: [Int] = []

let s0 = input.partitioned({ _ in return true })

XCTAssertTrue(s0.0.isEmpty)
XCTAssertTrue(s0.1.isEmpty)
}

/// Test the example given in the `partitioned(_:)` documentation
func testPartitionedExample() throws {
let cast = ["Vivien", "Marlon", "Kim", "Karl"]
let (longNames, shortNames) = cast.partitioned({ $0.count < 5 })
XCTAssertEqual(longNames, ["Vivien", "Marlon"])
XCTAssertEqual(shortNames, ["Kim", "Karl"])
}

func testPartitionedWithPredicate() throws {
let s0 = ["A", "B", "C", "D"].partitioned({ $0 == $0.lowercased() })
let s1 = ["a", "B", "C", "D"].partitioned({ $0 == $0.lowercased() })
let s2 = ["a", "B", "c", "D"].partitioned({ $0 == $0.lowercased() })
let s3 = ["a", "B", "c", "d"].partitioned({ $0 == $0.lowercased() })

XCTAssertEqual(s0.0, ["A", "B", "C", "D"])
XCTAssertEqual(s0.1, [])

XCTAssertEqual(s1.0, ["B", "C", "D"])
XCTAssertEqual(s1.1, ["a"])

XCTAssertEqual(s2.0, ["B", "D"])
XCTAssertEqual(s2.1, ["a", "c"])

XCTAssertEqual(s3.0, ["B"])
XCTAssertEqual(s3.1, ["a", "c", "d"])
}
}
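
The tests shown here use non-throwing predicates only. Since `partitioned(_:)` is declared `rethrows`, a throwing predicate could be exercised along the following lines. This is a sketch under stated assumptions: `partitionedSketch` and `NegativeValue` are hypothetical names, and the stand-in mirrors the simple `Sequence` loop rather than the buffer-based `Collection` overload, so it does not cover that overload's cleanup path.

```swift
// Hypothetical error type used only for this illustration.
struct NegativeValue: Error, Equatable {
  let value: Int
}

// Hypothetical stand-in mirroring the `Sequence` overload; not the package API.
func partitionedSketch<S: Sequence>(
  _ source: S,
  _ belongsInSecond: (S.Element) throws -> Bool
) rethrows -> ([S.Element], [S.Element]) {
  var first: [S.Element] = []
  var second: [S.Element] = []
  for element in source {
    if try belongsInSecond(element) {
      second.append(element)
    } else {
      first.append(element)
    }
  }
  return (first, second)
}

var caught: NegativeValue?
do {
  // The predicate throws on the third element, so no result is produced.
  _ = try partitionedSketch([1, 2, -3, 4]) { value -> Bool in
    if value < 0 { throw NegativeValue(value: value) }
    return value.isMultiple(of: 2)
  }
} catch let error as NegativeValue {
  caught = error
} catch {
  // Any other error type is unexpected in this sketch.
}
```

In an `XCTestCase`, the same check could be written with `XCTAssertThrowsError` against the real method.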