
feat: keep a hint of last snapshot location while committing new snapshot #7418


Merged
merged 8 commits into from
Sep 2, 2022



@dantengsky dantengsky commented Aug 31, 2022

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

Background:

This PR is an auxiliary sub-task of database object sharing.

By keeping a hint file for each table, which contains the location of the table's last snapshot, the sharing service can work by accessing object storage only (without touching the metastore service or databend-query nodes).

@ZhiHanZ @flaneur2020 please correct me if the above summary is not right.

some refs:

Functionality of this PR

Whenever a table commit succeeds, try to write a hint file that records the location of the table's last snapshot:

  • at a "fixed" location
    /$STORAGE_PREFIX/$DB_ID/$TABLE_ID/last_snapshot_location_hint
  • the content is a plain string giving the location of the latest snapshot
    /$STORAGE_PREFIX/$DB_ID/$TABLE_ID/xxxxx.json
  • where STORAGE_PREFIX is configured by the parameter root of the storage type

For example, with the local file system storage type, the hint looks like:

  • path: /tmp/.tmpNP66QH/1/11/last_snapshot_location_hint
  • content of the above file: /tmp/.tmpNP66QH/1/11/_ss/28a2109a67544c73a5af922acbabce68_v1.json
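
The hint layout described above can be sketched as a pair of small helpers. This is only a minimal illustration, not the code from this PR: the names hint_path and read_last_snapshot_location are hypothetical, the IDs are assumed to be plain integers, and std::fs on a local directory stands in for an object-storage read.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// File name of the hint, as described in this PR.
const HINT_FILE_NAME: &str = "last_snapshot_location_hint";

/// Hypothetical helper: builds
/// `$STORAGE_PREFIX/$DB_ID/$TABLE_ID/last_snapshot_location_hint`.
fn hint_path(storage_prefix: &Path, db_id: u64, table_id: u64) -> PathBuf {
    storage_prefix
        .join(db_id.to_string())
        .join(table_id.to_string())
        .join(HINT_FILE_NAME)
}

/// Hypothetical reader: returns the snapshot location stored in the hint file.
/// A consumer such as the sharing service would only need storage access for this.
fn read_last_snapshot_location(
    storage_prefix: &Path,
    db_id: u64,
    table_id: u64,
) -> io::Result<String> {
    let content = fs::read_to_string(hint_path(storage_prefix, db_id, table_id))?;
    // The hint is a plain string; trim any trailing newline defensively.
    Ok(content.trim().to_string())
}
```

Under these assumptions, calling hint_path(Path::new("/tmp/.tmpNP66QH"), 1, 11) yields the path shown in the example above, and the reader returns the snapshot location stored in that file.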

NOTE:

    Writing the hint file of the last snapshot is best-effort only:
     - the write is retried on temporary IO failures
    but
     - if the write ultimately fails, the error is ignored
     - errors (if any) are not propagated (only warning log messages are left)
     - "data races" are ignored:
       if multiple different versions of the hint are written concurrently,
       it is NOT guaranteed that the latest version will be kept
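
The best-effort behaviour in the note above could look roughly like the following. This is a sketch under stated assumptions rather than the implementation in this PR: write_hint_best_effort, its parameters, and the fixed backoff are hypothetical, and for simplicity it retries on any error instead of distinguishing temporary IO failures from permanent ones.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Hypothetical best-effort writer: calls `write` up to `max_attempts` times.
/// Failures are only logged as warnings and never returned as errors, so a
/// failed hint write cannot fail the table commit that triggered it.
/// The returned bool only reports whether the hint was written.
fn write_hint_best_effort<F>(mut write: F, max_attempts: u32, backoff: Duration) -> bool
where
    F: FnMut() -> io::Result<()>,
{
    for attempt in 1..=max_attempts {
        match write() {
            Ok(()) => return true,
            Err(e) if attempt < max_attempts => {
                // Assumed to be a temporary failure: warn, wait, retry.
                eprintln!("warn: writing hint failed (attempt {}): {}", attempt, e);
                thread::sleep(backoff);
            }
            Err(e) => {
                // Out of attempts: leave a warning and give up silently.
                eprintln!("warn: giving up on hint after {} attempts: {}", attempt, e);
            }
        }
    }
    false
}
```

Nothing in this sketch orders concurrent writers, which matches the caveat in the note: if two commits write hints at the same time, whichever write lands last wins, and that is not necessarily the newest snapshot.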

misc:

  • refactor: unify the table commit path

Fixes #issue


@dantengsky dantengsky changed the title Feat: keep a hint of last snapshot location while committing new snapshot feat: keep a hint of last snapshot location while committing new snapshot Aug 31, 2022
@dantengsky dantengsky force-pushed the feat-keep-last-snapshot-hint branch from 9740b81 to fda2531 Compare August 31, 2022 10:51
@mergify mergify bot added the pr-feature this PR introduces a new feature to the codebase label Aug 31, 2022
@dantengsky dantengsky force-pushed the feat-keep-last-snapshot-hint branch from bac8eae to beabab9 Compare August 31, 2022 15:41
@dantengsky dantengsky marked this pull request as ready for review August 31, 2022 17:50

@zhyass zhyass left a comment


LGTM

@mergify mergify bot merged commit 1e72156 into databendlabs:main Sep 2, 2022