Skip to content
This repository was archived by the owner on Sep 11, 2020. It is now read-only.

plumbing: use seen map in tree walker #564

Merged
merged 1 commit into from
Aug 28, 2017

Conversation

strib
Copy link
Contributor

@strib strib commented Aug 25, 2017

This helps avoid iterating down the same trees for every commit. For a big-ish repo with 35K objects (17K commits), this reduced the time for calling revlist.Objects during a push (with 0 hashes to ignore) from more than ten minutes to less than a minute.

@strib strib force-pushed the tree-walker-use-seen-cache branch from 498dc32 to 52b25c7 Compare August 25, 2017 23:34
@codecov
Copy link

codecov bot commented Aug 25, 2017

Codecov Report

Merging #564 into master will decrease coverage by 0.61%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #564      +/-   ##
==========================================
- Coverage      78%   77.38%   -0.62%     
==========================================
  Files         129      129              
  Lines        9842     9845       +3     
==========================================
- Hits         7677     7619      -58     
- Misses       1327     1400      +73     
+ Partials      838      826      -12
Impacted Files Coverage Δ
plumbing/object/treenoder.go 80.35% <100%> (ø) ⬆️
plumbing/object/file.go 75% <100%> (ø) ⬆️
plumbing/object/tree.go 76.88% <100%> (+0.35%) ⬆️
plumbing/revlist/revlist.go 75.64% <100%> (-2.57%) ⬇️
plumbing/transport/ssh/common.go 20.54% <0%> (-45.21%) ⬇️
plumbing/transport/ssh/auth_method.go 31.57% <0%> (-22.81%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 631a45f...d68f45f. Read the comment docs.

@@ -297,9 +297,10 @@ func (iter *treeEntryIter) Next() (TreeEntry, error) {

// TreeWalker provides a means of walking through all of the entries in a Tree.
type TreeWalker struct {
stack []treeEntryIter
stack []*treeEntryIter
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems safer to me; I was worried things that accessed the stack might end up calling Next on a copy, rather than the real object. But the way the stack is used now, I don't think it's a real bug. If you want me to revert it, let me know.

strib added a commit to keybase/kbfs that referenced this pull request Aug 25, 2017
Compared to the v4 branch we were using before, master branch seems to
fix a few bugs and bring the time for pushing the KBFS repo from hours
to about 25 minutes.  With the following optimizations, I got that
time down to less than 3 minutes.

* src-d/go-git#564
* src-d/go-git#565

I suspect there are still lots more optimizations that can be done.  I
don't think there are many users of Push in this project.
strib added a commit to keybase/kbfs that referenced this pull request Aug 26, 2017
Compared to the v4 branch we were using before, master branch seems to
fix a few bugs and bring the time for pushing the KBFS repo from hours
to about 25 minutes.  With the following optimizations, I got that
time down to less than 3 minutes.

* src-d/go-git#564
* src-d/go-git#565

I suspect there are still lots more optimizations that can be done.  I
don't think there are many users of Push in this project.
strib added a commit to keybase/kbfs that referenced this pull request Aug 26, 2017
Compared to the v4 branch we were using before, master branch seems to
fix a few bugs and bring the time for pushing the KBFS repo from hours
to about 25 minutes.  With the following optimizations, I got that
time down to less than 3 minutes.

* src-d/go-git#564
* src-d/go-git#565

I suspect there are still lots more optimizations that can be done.  I
don't think there are many users of Push in this project.
strib added a commit to keybase/kbfs that referenced this pull request Aug 26, 2017
Compared to the v4 branch we were using before, master branch seems to
fix a few bugs and bring the time for pushing the KBFS repo from hours
to about 25 minutes.  With the following optimizations, I got that
time down to less than 3 minutes.

* src-d/go-git#564
* src-d/go-git#565

I suspect there are still lots more optimizations that can be done.  I
don't think there are many users of Push in this project.
@mcuadros mcuadros requested a review from erizocosmico August 27, 2017 21:48
Copy link
Contributor

@erizocosmico erizocosmico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! Looks good to me, just one small request: can we add a test for Next that ensures that hashes in seen are actually skipped?

This helps avoids iterating down the same trees for every commit.  For
a big-ish repo with 35K objects (17K commits), this reduced the time
for calling `revlist.Objects` during a push (with 0 hashes to ignore)
from more than ten minutes to less than a minute.
@strib strib force-pushed the tree-walker-use-seen-cache branch from 52b25c7 to d68f45f Compare August 28, 2017 00:15
@strib
Copy link
Contributor Author

strib commented Aug 28, 2017

Thanks for looking @erizocosmico. Now there's a simple test.

Copy link
Contributor

@erizocosmico erizocosmico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks for the contribution!

@mcuadros mcuadros merged commit 96286b0 into src-d:master Aug 28, 2017
@strib strib deleted the tree-walker-use-seen-cache branch August 28, 2017 16:59
strib added a commit to keybase/kbfs that referenced this pull request Aug 28, 2017
Compared to the v4 branch we were using before, master branch seems to
fix a few bugs and bring the time for pushing the KBFS repo from hours
to about 25 minutes.  With the following optimizations, I got that
time down to less than 3 minutes.

* src-d/go-git#564
* src-d/go-git#565

I suspect there are still lots more optimizations that can be done.  I
don't think there are many users of Push in this project.
strib added a commit to keybase/kbfs that referenced this pull request Aug 28, 2017
Compared to the v4 branch we were using before, master branch seems to
fix a few bugs and bring the time for pushing the KBFS repo from hours
to about 25 minutes.  With the following optimizations, I got that
time down to less than 3 minutes.

* src-d/go-git#564
* src-d/go-git#565

I suspect there are still lots more optimizations that can be done.  I
don't think there are many users of Push in this project.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants