-
Notifications
You must be signed in to change notification settings - Fork 10.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add uniq() to SequenceType #183
Conversation
/// Return an `Array` containing the elements of `self` | ||
@warn_unused_result | ||
public func uniq() -> [Generator.Element] { | ||
return reduce(Array<Generator.Element>()) { (arr, element) -> [Generator.Element] in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not actually the right way to do this, because it's O(n^2). See http://airspeedvelocity.net/2015/08/03/arrays-linked-lists-and-performance/ for details. You should switch to using a temporary array and a plain old for loop, despite the ugliness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rothomp3 beat me to it :) yes reduce can really suffer performance-wise. here's an elegant answer I stumbled upon a few month back (it's a global function in this case, but making it into a protocol extension is trivial)
func uniq<S: SequenceType, E: Hashable where E == S.Generator.Element>(source: S) -> [E] {
var seen = [E: Bool]()
return source.filter { seen.updateValue(true, forKey: $0) == nil }
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dictionary is probably unnecessary - there's no need to store the Bool
values. You could do the same with just a Set
:
extension SequenceType where Generator.Element : Hashable {
public func uniq() -> [Generator.Element] {
var prevs: Set<Generator.Element> = []
return filter { el in
if prevs.contains(el) { return false }
prevs.insert(el)
return true
}
}
}
Which is maybe a little more efficient. Both the Dictionary
and Set
versions translate easily into a lazy method:
extension LazySequenceType where Generator.Element : Hashable {
public func uniq() -> LazyFilterSequence<Self> {
var prevs: Set<Generator.Element> = []
return lazy.filter { el in
if prevs.contains(el) { return false }
prevs.insert(el)
return true
}
}
}
However, there's another problem with these implementations: they rely on side-effects from filter
's closure, which is generally discouraged (I think). For instance, the function would be broken if the closure was evaluated more than once, or in a different order. So it's maybe best to have a more imperative for
loop:
extension SequenceType where Generator.Element : Hashable {
public func uniq() -> [Generator.Element] {
var prevs: Set<Generator.Element> = []
var uniqs: [Generator.Element] = []
for el in self where !prevs.contains(el) {
prevs.insert(el)
uniqs.append(el)
}
return uniqs
}
}
But then the lazy version would need to be more complex as well. (You'd probably have to write a custom struct, similar to the standard library's LazyFilterSequence
. This is the kind of thing I mean)
Considering all that, I'm not sure what the function or method would add over just using the standard Set
initialiser. The order of the elements is retained, and there's a possible lazy version, but it's a lot of extra complexity for maybe not much extra functionality.
Thank you for your PR. This is a change to the library API, so it should follow the Swift Evolution Process. Please write an informal proposal and send it to [email protected]. |
Closing because this is not how we propose library changes. |
I got it. |
SR-2856: adapt dispatch_timer_set_time for CI
Fix formatting of empty type declarations.
…ials [Syntax] Rename the `as` conversion functions that incur existential conversions to `asProtocol`
No description provided.