Skip to content

Sub-optimal performance on remove during resize #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jonhoo opened this issue Jun 30, 2020 · 3 comments · Fixed by #12
Closed

Sub-optimal performance on remove during resize #1

jonhoo opened this issue Jun 30, 2020 · 3 comments · Fixed by #12
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@jonhoo
Copy link
Owner

jonhoo commented Jun 30, 2020

When a remove happens during the resize period, we need to restart the RawIter we have over the old table to track which elements have yet to be removed. This restart happens here:

griddle/src/lib.rs

Lines 865 to 871 in b2a063c

// By changing the state of the table, we have invalidated the table iterator
// we keep for what elements are left to move. So, we re-compute it.
//
// TODO: We should be able to "fix up" the iterator rather than replace it,
// which would save us from iterating over the prefix of empty buckets we've
// left in our wake from the moves so far.
lo.items = lo.table.iter();

Unfortunately, when the iterator is restarted this way, it has to walk all the empty buckets of elements we have already moved the next time we try to move a key. When the old map is large, this can take a significant amount of time.

It would be better if we could "refresh" the iterator instead. This should be possible, since we know the old table has neither shrunk nor grown. I probably requires some changes internally in hashbrown. I pinged the hashbrown author about it in rust-lang/hashbrown#166 (comment).

@Amanieu
Copy link

Amanieu commented Jun 30, 2020

Erasing an element doesn't actually invalidate a RawIter as long as the element you removes is "behind" the iterator. This should probably be documented better, but if you look at retain you can see that is how it works.

@jonhoo
Copy link
Owner Author

jonhoo commented Jun 30, 2020

Ah, I think maybe some context is needed here.

The iterator griddle keeps is an iterator over the "old" table of elements that have not yet been moved. We keep that iterator around even while other operations are performed (like removals) so that the next insert can immediately get the next item to move without having to scan for non-empty buckets. There are no elements left "before" the iterator, as they have all been moved.

The issue here is that the user calls remove on the top-level map, which then ends up removing an upcoming element from the cached iterator (potentially even in its current group). At the very least, the items counter in the RawIter needs to be decremented, but I also experienced that the cached iterator would yield empty buckets for the items that have been removed since when it was created. Basically, this code:

griddle/src/lib.rs

Lines 783 to 787 in 91b0f5b

if let Some(e) = lo.items.next() {
// We need to remove the item in this bucket from the old map
// to the resized map, without shrinking the old map.
let (k, v) = unsafe {
lo.table.erase_no_drop(&e);

Triggered this assertion:
https://github.com/rust-lang/hashbrown/blob/efbd03661fc9df0e594ecc89784107353cc71060/src/raw/mod.rs#L491

@jonhoo
Copy link
Owner Author

jonhoo commented Jun 30, 2020

I think a "general" refresh is probably too tricky to add (e.g., you wouldn't know how to change items without re-scanning), but maybe a notify_remove_ahead method that decrements by 1 and refreshes the current iterator group?

jonhoo added a commit that referenced this issue Jul 1, 2020
This fixes both #1 and #2 by taking advantage of
rust-lang/hashbrown#167.

It won't work until that PR is merged, and also requires an
implementation of `Clone` for `RawIntoIter`.
jonhoo added a commit that referenced this issue Oct 19, 2024
…odecov-action-3

Bump codecov/codecov-action from 2 to 3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
2 participants