-
Notifications
You must be signed in to change notification settings - Fork 286
Gitoxide for performance improvements #635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It's using the latest version from github for now until it's clear what should be published later - gitoxide might change to provide the best possible experience.
…necessary work Now we only create as many authors as we need to avoid allocating more strings then needed when converting BString to its display representation.
It's actually quite involved to get proper short-ids, and git even applies some heuristics to calculate good starting values for the abbreviation length, right before validating uniqueness.
f372e67
to
4fc3334
Compare
This can take a long time on massive repos.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all your hard work so far! I have a couple more questions:
… identity Additionally we sort contributors by name in case they have the same amount of commits to assure stable results, which may help in some cases where similar commits counts would be displayed as they are in the top 3.
We cache the lower-case value only if it is different from what's there to speed up comparisons while using memory/allocations only when needed.
I think it's ready for a formal review. Thanks. |
`git shortlog -sne | wc -l` is the reference.
…leading to it being displayed twice.
87b020e
to
eb753f9
Compare
I took a look at what mainline does when displaying shallow clones and noticed that it displays it as Hopefully this is something you find as valuable as me, please let me know what you think. Edit: Here is how it looks like. |
That way apps using it won't be surprised by default, but instead can upgrade to deal even better with shallow clones by using the data provided in the iterator.
That is a good idea! Shallow clones can misrepresent the commit count, after all. Issue #592 may also be relevant -- it seems to be another way that a user might truncate the history. I'd consider this a new feature besides the goal performance improvement. Would you be willing to keep a log of additional bugfixes/features in this PR's description? That way we can make sure to give you credit for all your improvements in the next release 🙂 ## Additional Changes
- detect shallow clones |
@spenserblack Thanks so much for pointing me to this issue. Even though I stumbled across the graft/replace feature when studying git's source I could never make sense of it. That changed now and I will implement support for it in @o2sh Thanks for the issues, I created a new section in the PR body to track those. Besides that, I think all review comments are now addressed 🎉, but I will let you know separately once I think it's ready for a final look. |
Many thanks for the hint about replacement objects, thanks to that This means that from my side, this PR is now feature complete and I am looking forward to your feedback to push it over the finishing line. |
I've just approved the PR. Prior to that, don't forget to:
@spenserblack, do we keep the Thanks again for your contribution @Byron, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your hard work maintaining gitoxide
and making this PR!
These are just some comments to think about. The only one that's somewhat high-priority is the BUG
expect
s, but overall LGTM.
Ideally we add more test coverage so users have no chance from ever seeing it.
Previously this wasn't possible as commits would be kept in `Repo` which would cause self-referential borrow check issues unless the git2 repository was kept outside.
You are welcome, it's a pleasure :). Thanks for all your suggestions, you find them implemented in the new commits. I will let it settle for a day or so in case other bits and pieces come up, and go ahead with the merge otherwise. |
And it's done 🎉! Thanks everyone for their suggestions and comments! I can't wait to finalize the transition to |
This is a very first preview of what it would mean to use gitoxide for commit-traversal alone, done with the smallest possible changes for everything to remain as familiar as possible. It ain't super pretty yet, but it will get there. My rough plan is to get feedback early and ultimately squash all/most commits once a merge is possible.
Furthermore I think a delayed progress bar could easily be implemented to provide some entertainment while people are waiting for their huge repositories (those will never finish below 1s no matter how hard we try 😅).
Right now it's 2.2x faster on reactos and 2.4x faster on the linux kernel at 4% the heap memory consumption compared to what's on
main
.I am looking forward to your feedback.
Making-of Video
gitoxide
to make this PR possible.Additional Changes
git replace
) #592)Tasks
git-mailmap
gitoxide
for commit graph traversalSig
gitoxide
in all possible places to validate the API is en-par withgit2
or more convenientgit2
, but we are working on it, this should be ready this year.gitoxide
and switch to usingcrates.io
Performance
e3b29b0 - commit traversal with
gitoxide
0652bbe - minimize allocations
633f0ce - parallelize workspace status, tokei, and commit traversal
Performance comparison with
main
On 633f0ce
reactos
linux v5.16
On 0652bbe
reactos
** linux**
On e3b29b0
reactos
linux v5.16
It also uses less memory (already)
Note that the contributor count is off by 2 - it's likely to be related to the mailmap, and for I don't know if
libgit2
is off orgitoxide
. I will look into it.