-
Notifications
You must be signed in to change notification settings - Fork 1.2k
git: use dulwich for .gitignore #4809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In my tests `Git._ignore` is taking around ~65seconds for the whole test suit(~50K invocations). The reason is that `gitpython` is launching a real `git check-ignore` process, that takes time and resources. Switching to `pygit2` or `dulwich` makes it drop to ~5seconds, saving us almost a minute of the test run time and a lot of resources by not launching any processes. I've chosen `dulwich` over `pygit2` here because it is much more pythonic and as long as the performance is on-par with `pygit2` it is better for us to use `dulwich`.
@@ -50,6 +50,7 @@ def run(self): | |||
"colorama>=0.3.9", | |||
"configobj>=5.0.6", | |||
"gitpython>3", | |||
"dulwich>=0.20.6", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only problem is that both dulwich and pygit don't have wheels for 3.9, but that is solvable.
return True | ||
except GitCommandError: | ||
return False | ||
repo = Repo(self.root_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasteful to have a repo instance created each time here, but doing so for drop-in simplicity. Likely dulwich will replace everything except Git.clone()
in the near future and so Repo
will move to __init__
.
Indeed, looks like the tests on windows are more stable now. Will be continuing the migration for other methods in the further PRs. EDIT: still flaky https://github.com/iterative/dvc/runs/1329409665?check_suite_focus=true |
* Latest dvc 1.10 re-use dulwich for .gitignore iterative/dvc#4809 * Regression: reintroduce issue reported in iterative/dvc#2010 * Resolution: pin dvc version to 1.9.1 for now * Build image is docker:dind Alpine Linux
In my tests
Git._ignore
is taking around ~65seconds for the whole testsuit(~50K invocations). The reason is that
gitpython
is launching a realgit check-ignore
process, that takes time and resources.Switching to
pygit2
ordulwich
makes it drop to ~5seconds, saving usalmost a minute of the test run time and a lot of resources by not
launching any processes.
I've chosen
dulwich
overpygit2
here because it is much morepythonic and the performance is on-par with
pygit2
.Related to #2215
❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏