
feat(relocation): Add GCP-backed export checkpointer #80803


Merged
merged 1 commit into from
Nov 18, 2024

Conversation

azaslavsky
Contributor

This builds on the work of #80711 to add a GCP-backed ExportCheckpointer implementation. Now, when exporting, we save (always encrypted!) copies of the progress on each model kind seen so far. If the export fails halfway through, we can use these checkpoints to recover much more quickly than if we had to redo all of that work, ensuring a higher chance of success on the retry.
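The checkpoint-and-resume flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `InMemoryCheckpointer`, `export_all`, the `get`/`add` method names, and the blob key layout are all hypothetical, a plain dict stands in for the GCP bucket, and the `encrypt`/`decrypt` callables are injected placeholders rather than the real encryption used in the export.

```python
import json
from typing import Any, Callable, Dict, List, Optional


class InMemoryCheckpointer:
    """Hypothetical stand-in for a bucket-backed export checkpointer.

    A dict plays the role of the storage bucket; only encrypted bytes
    are ever stored in it.
    """

    def __init__(
        self,
        relocation_uuid: str,
        encrypt: Callable[[bytes], bytes],
        decrypt: Callable[[bytes], bytes],
    ) -> None:
        self._uuid = relocation_uuid
        self._encrypt = encrypt
        self._decrypt = decrypt
        self._blobs: Dict[str, bytes] = {}

    def _key(self, model_name: str) -> str:
        # Assumed layout: one blob per relocation per model kind.
        return f"{self._uuid}/{model_name}.enc"

    def get(self, model_name: str) -> Optional[Any]:
        """Return the saved export for this model kind, or None on a miss."""
        blob = self._blobs.get(self._key(model_name))
        if blob is None:
            return None
        return json.loads(self._decrypt(blob))

    def add(self, model_name: str, rows: Any) -> None:
        """Encrypt and save the export for this model kind."""
        plaintext = json.dumps(rows).encode("utf-8")
        self._blobs[self._key(model_name)] = self._encrypt(plaintext)


def export_all(
    model_names: List[str],
    export_one: Callable[[str], Any],
    checkpointer: InMemoryCheckpointer,
) -> Dict[str, Any]:
    """Export each model kind, reusing a checkpoint whenever one exists."""
    results: Dict[str, Any] = {}
    for name in model_names:
        rows = checkpointer.get(name)
        if rows is None:
            rows = export_one(name)      # the expensive step
            checkpointer.add(name, rows)  # saved before moving on
        results[name] = rows
    return results
```

If `export_one` raises partway through, a second call to `export_all` with the same checkpointer skips every model kind that was already saved and only redoes the one that failed and those after it.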

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Nov 15, 2024
Comment on lines -218 to -221
    @abstractmethod
    def read(self) -> bytes:
        pass

Contributor Author

All changes to this file remove an unused method.


codecov bot commented Nov 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #80803   +/-   ##
=======================================
  Coverage   74.97%   74.97%           
=======================================
  Files        7207     7207           
  Lines      319509   319501    -8     
  Branches    44001    44001           
=======================================
- Hits       239546   239541    -5     
+ Misses      73442    73439    -3     
  Partials     6521     6521           

@azaslavsky azaslavsky force-pushed the azaslavsky/gcp-export-checkpointer branch from d9af536 to 7e75bca Compare November 18, 2024 18:01
@azaslavsky azaslavsky marked this pull request as ready for review November 18, 2024 20:35
@azaslavsky azaslavsky requested a review from a team November 18, 2024 20:35
if parsed_json is None:
    logger.info(
        "Export checkpointer: miss",
        extra=logger_data,
Member

Just wanted to sanity check: There will be no sensitive data exposed here?

Contributor Author

Nope: just the name of the model, the UUID of the relocation (safe), and the size (in bytes) of the encrypted data used in the checkpoint. None of that is sensitive.
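As a rough illustration of that answer, the logged payload could be assembled like this. The log message `"Export checkpointer: miss"` and the name `logger_data` come from the snippet under review; everything else here (`build_logger_data`, `log_lookup`, the dictionary keys, the logger name, and the "hit" message) is assumed for the sketch and may not match the PR.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger("export_checkpointer_sketch")  # hypothetical logger name


def build_logger_data(
    model_name: str,
    relocation_uuid: str,
    encrypted_blob: Optional[bytes],
) -> Dict[str, Any]:
    """Collect only non-sensitive metadata; the payload itself is never logged."""
    data: Dict[str, Any] = {"model": model_name, "uuid": relocation_uuid}
    if encrypted_blob is not None:
        # Only the byte length of the encrypted blob, never its contents.
        data["size"] = len(encrypted_blob)
    return data


def log_lookup(
    model_name: str,
    relocation_uuid: str,
    encrypted_blob: Optional[bytes],
) -> Dict[str, Any]:
    """Log a checkpoint lookup and return the metadata that was logged."""
    logger_data = build_logger_data(model_name, relocation_uuid, encrypted_blob)
    if encrypted_blob is None:
        logger.info("Export checkpointer: miss", extra=logger_data)
    else:
        # Hypothetical counterpart message for the hit path.
        logger.info("Export checkpointer: hit", extra=logger_data)
    return logger_data
```

Nothing derived from the decrypted export reaches the log record in this sketch, which is the property the review question was checking.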

@azaslavsky azaslavsky merged commit 283cfd0 into master Nov 18, 2024
49 checks passed
@azaslavsky azaslavsky deleted the azaslavsky/gcp-export-checkpointer branch November 18, 2024 22:19

sentry-io bot commented Nov 19, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:


@github-actions github-actions bot locked and limited conversation to collaborators Dec 4, 2024
Labels
Scope: Backend Automatically applied to PRs that change backend components

2 participants