Skip to content

Sha 256 support #281

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
3 of 9 tasks
Byron opened this issue Dec 21, 2021 · 5 comments
Open
3 of 9 tasks

Sha 256 support #281

Byron opened this issue Dec 21, 2021 · 5 comments
Labels
C-tracking-issue An issue to track to track the progress of multiple PRs or issues help wanted Extra attention is needed

Comments

@Byron
Copy link
Member

Byron commented Dec 21, 2021

Even though the foundation is set, it needs another push to actually make it work with different kinds of hashes.

Tasks

  • remove hash-type specific methods from git-hash and replace them with parametric usage of git_hash::Kind
  • all code assuming hashes of len 20 should receive this value as parameter instead. This is what git does for the old index and pack file formats.
  • a way to pass --object-hash information to the gix CLI
  • remove SHA1 mention from git-features feature toggles
  • parameterize hash len when decoding non-blob objects (see this for an example)
  • understand and implement pack idx V3. - see if git actually implements this, and maybe decide that gitoxide won't handle the transition period, is either one has or another.
  • add new Sha256 enum variant, consider putting it behind a feature flag, and add a hasher for it as well.
  • general tests for reading refs and objects of different len
  • tests for writing and reading objects of different len, maybe even write a conversion program which transforms an entire repo and double-checks with git-fsck
  • when cloning, check the uninmplemented!() invocation to configure the repo for expecting a different hash

Implementation ideas

  • make sure once Sha256 is added as ObjectId variant, that it's behind a feature toggle to allow builds that opt-out of SHA256 support to not unnecessarily use more memory than needed. Maybe there are alternatives to this, too.
  • One way to do that with approximately zero overhead would be to such functions generic on the object ID, using a trait that has a method to get the type. Then object IDs with a known type return a constant from that method, and object IDs with a runtime dispatched type return the value of that enum.

    • @joshtriplett - taken verbatim as I'd barely be able to improve on it when paraphrasing. In short, have a trait for oid or allow efficient conversions to oid (it's just a slice, so that should work for specifically sized types as well especially if these were provided by git-hash.

Notes

  • find ways to use the existing highly-parallel pack traversal (along with integration of loose-objects) to build an inverse-ref table to quickly traverse objects bottom-up to change the hash used along with all references, while being fast. This ties into being able to build new packs quickly, ideally even with delta-compression (the latter then has to be re-created as most objects actually change) - re-using deltas for blobs is the only way.
    • The existing traversal can mutate data in the tree, which is enough to decode the object and keep direct references for later.
@Byron Byron added the C-tracking-issue An issue to track to track the progress of multiple PRs or issues label Jan 23, 2022
@shea256
Copy link
Contributor

shea256 commented Dec 20, 2022

+1 to this feature. Excited to see this implemented.

@Byron Byron mentioned this issue Mar 15, 2023
@msrd0
Copy link

msrd0 commented May 16, 2024

+1. Given that git no longer considers sha256 support experimental and forges such as forgejo added support, this has became more relevant for new git repositories.

@D4V1D3-008
Copy link

+1. Without sha256 support Starship prompt cannot detect repositories using that algorithm.

@jokeyrhyme
Copy link

@Byron

make sure once Sha256 is added as ObjectId variant, that it's behind a feature toggle to allow builds that opt-out of SHA256 support to not unnecessarily use more memory than needed. Maybe there are alternatives to this, too.

should the old SHA1 support also be put behind a feature toggle (enabled by default), so that builds can choose to only support SHA256 ?

@Byron Byron added the help wanted Extra attention is needed label Mar 24, 2025
@Byron
Copy link
Member Author

Byron commented Mar 24, 2025

+1. Without sha256 support Starship prompt cannot detect repositories using that algorithm.

Starship would definitely be a great motivation to finally go through with this.

gix-hash has been created with that in mind:

/// A SHA 1 hash digest
Sha1([u8; SIZE_OF_SHA1_DIGEST]),

Right now it's an enum and one could add another variant to it, behind a feature toggle, as the size of these hash digests definitely affects the resource requirements of operations like clone. As the size of these hashes would increase, this would affect the memory consumption even when cloning SHA1 repositories (unfortunately).
Maybe there are better ways to represent this - I think Git simply allocates for it which would be one solution here as well.

In any case, if this were to move forward, I think it would be good to review the hash representation first, as it's very relevant for performance.

@Byron

make sure once Sha256 is added as ObjectId variant, that it's behind a feature toggle to allow builds that opt-out of SHA256 support to not unnecessarily use more memory than needed. Maybe there are alternatives to this, too.

should the old SHA1 support also be put behind a feature toggle (enabled by default), so that builds can choose to only support SHA256 ?

I see the point and gitoxide should allow to be tailored to what application developers need. However, it by default it will try to be compatible to Git which means SHA1 and SHA256 must be supported, which is certainly what this would start out as.
Once there is explicit demand, I think a feature toggle for turning off SHA1 support could be contributed. This will be some work though, as the test-suite directly refers to it in plenty of places.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue An issue to track to track the progress of multiple PRs or issues help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

5 participants