Skip to content

PEP 740 persistence, take 3 #16624

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 20 commits into from
Closed

Conversation

woodruffw
Copy link
Member

@woodruffw woodruffw commented Sep 3, 2024

WIP.

Action plan:

This reverts commit b6cf775.

This is a breakout from pypi#16624,
to reduce the complexity/headache of zippered reverts.

Signed-off-by: William Woodruff <[email protected]>
Signed-off-by: William Woodruff <[email protected]>
Signed-off-by: William Woodruff <[email protected]>
@woodruffw
Copy link
Member Author

I discussed the architectural challenges here a bit with @ewdurbin, and we came to the conclusion that the current scale/scope and expected volume doesn't warrant the complexity that's coming from the attestations feature needing to interact with object storage.

Given that files can currently only have a single attestation, our current conclusion is that the DB should contain a Provenance model, which in turn bears the individual attestation objects/bundles as members of a JSONB list.

In other words, the current data model of File -> [Attestation] will become File -> Provenance?, where Provenance tracks an interior set of attestation bundles as JSONB.

Signed-off-by: William Woodruff <[email protected]>
Signed-off-by: William Woodruff <[email protected]>
@ewdurbin
Copy link
Member

ewdurbin commented Sep 4, 2024

Hm, I think I came to a slightly different mental model than you.

I imagined immutable Attestation objects remaining as a concrete-db-backed model and Provenance being rendered from all attestations related to a file by joining them up then generating a hash. Bonus: this retains the ability for additional attestations to be added down the line.

@woodruffw
Copy link
Member Author

woodruffw commented Sep 4, 2024

I imagined immutable Attestation objects remaining as a concrete-db-backed model and Provenance being rendered from all attestations related to a file by joining them up then generating a hash. Bonus: this retains the ability for additional attestations to be added down the line.

I messed around with this a bit, but it has the "lifecycle" problem that our earlier design (which avoided persisting provenance objects to artifact storage at all) had: the provenance object needs access to the uploading request object to persist the appropriate Trusted Publisher metadata + claims. We could generate it outside of that request context, but doing so requires another mechanism to hold onto that request state. That becomes especially annoying with the OIDC claims, since those are a free-structured dict that aren't part of the TP state itself 😅

I'm halfway through rebasing this PR on top of #16625, which will hopefully make it a lot simpler and demonstrate how I think (I hope) we can still do mutability while sticking to only a single concrete DB-backed Provenance model. But if the rebased version isn't what you're thinking, I'm happy to undo and try and push forwards with a "collect the provenance from concrete DB-based attestations" approach 🙂

Signed-off-by: William Woodruff <[email protected]>
Signed-off-by: William Woodruff <[email protected]>
This reverts commit b6cf775.

Signed-off-by: William Woodruff <[email protected]>
Signed-off-by: William Woodruff <[email protected]>
def parse_attestations(
self, request: Request, _distribution: Distribution
) -> list[Attestation]:
publisher: OIDCPublisher | None = request.oidc_publisher
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FLAG: This test is performed in the IntegrityService but was not done here and adding it highlighted some issues.

DarkaMaul and others added 2 commits September 5, 2024 17:17
Signed-off-by: William Woodruff <[email protected]>
di added a commit that referenced this pull request Sep 11, 2024
* warehouse: PEP 740 models

This is a breakout from #16624,
to reduce the complexity/headache of zippered reverts.

Signed-off-by: William Woodruff <[email protected]>

* tests: AttestationFactory helper

Signed-off-by: William Woodruff <[email protected]>

* tests: attestation coverage

Signed-off-by: William Woodruff <[email protected]>

* switch to a Provenance model

Signed-off-by: William Woodruff <[email protected]>

* provenance model carries a digest

Signed-off-by: William Woodruff <[email protected]>

* remove unused factory

Signed-off-by: William Woodruff <[email protected]>

* use SHA-256 for provenance digest

Signed-off-by: William Woodruff <[email protected]>

* mark provenance column as deferred

Signed-off-by: William Woodruff <[email protected]>

* Add a ProvenanceFactory object and a simple test.

* Revert "Add a ProvenanceFactory object and a simple test."

This reverts commit 34ec661.

* Update warehouse/packaging/models.py

---------

Signed-off-by: William Woodruff <[email protected]>
Co-authored-by: Alexis <[email protected]>
Co-authored-by: Dustin Ingram <[email protected]>
woodruffw added a commit to trail-of-forks/warehouse that referenced this pull request Sep 11, 2024
Breakout from pypi#16624.

Signed-off-by: William Woodruff <[email protected]>
woodruffw added a commit to trail-of-forks/warehouse that referenced this pull request Sep 27, 2024
Broken out from pypi#16624.

Signed-off-by: William Woodruff <[email protected]>
@woodruffw
Copy link
Member Author

Superseded by #16801.

@woodruffw woodruffw closed this Sep 27, 2024
woodruffw added a commit to trail-of-forks/warehouse that referenced this pull request Oct 3, 2024
Broken out from pypi#16624.

Signed-off-by: William Woodruff <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants