🎨Storage: HA (🚨🚨🚨 test with multiple replicas) #7375

Merged

@sanderegg sanderegg commented Mar 17, 2025

What do these changes do?

This PR unlocks running multiple replicas of the storage microservice by:

  • moving file copying to the storage worker via Celery
  • triggering file copying via RPC instead of REST (the REST API POST /simcore-s3/folders is gone)
  • moving completion of multipart uploads from an in-process task to a Celery task, i.e. offloading it to the storage worker (the REST API is kept here for backwards compatibility, and it already is a long-running task)
  • running the storage manager, Celery (with in-memory transfer), and the storage worker in the same process in unit tests to allow testing (there is still leeway to improve this); see test_rpc_handlers_paths
  • --> testing storage now also requires the storage worker to be up
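
To illustrate why offloading helps with replication, here is a minimal sketch of the pattern using only the Python standard library. It is not the code from this PR and it does not use Celery: the shared `queue.Queue` is a stand-in for the broker, and `submit_copy`, `worker_loop`, and `run_demo` are hypothetical names. The point is that an API replica only enqueues a job and returns a task id, while the long-running work happens in a separate worker.

```python
import queue
import threading
import uuid

# Hypothetical stand-in for the Celery broker: any API replica can publish
# to this queue and any worker can consume from it.
task_queue: queue.Queue = queue.Queue()
results: dict[str, str] = {}
results_lock = threading.Lock()


def submit_copy(src: str, dst: str) -> str:
    """Called by an API replica: enqueue the job and return immediately."""
    task_id = uuid.uuid4().hex
    task_queue.put((task_id, src, dst))
    return task_id


def worker_loop() -> None:
    """Runs in the worker: executes jobs taken from the shared queue."""
    while True:
        job = task_queue.get()
        if job is None:  # sentinel value: shut the worker down
            task_queue.task_done()
            break
        task_id, src, dst = job
        # A real worker would copy the stored objects here.
        with results_lock:
            results[task_id] = f"copied {src} -> {dst}"
        task_queue.task_done()


def run_demo() -> dict[str, str]:
    """Start one worker, submit two jobs, and wait for both to finish."""
    worker = threading.Thread(target=worker_loop, daemon=True)
    worker.start()
    ids = [
        submit_copy("project-a", "project-b"),
        submit_copy("project-c", "project-d"),
    ]
    task_queue.join()     # block until every submitted job is processed
    task_queue.put(None)  # ask the worker to stop
    worker.join()
    return {task_id: results[task_id] for task_id in ids}
```

Because no job state lives inside the process that accepted the request, any replica can take the follow-up call that asks for the result of a given task id.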

Bonus:

  • fixes the flaky storage worker: it now waits for the FastAPI initialization to complete
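
The idea behind this fix can be sketched as follows. This is an assumption about the general technique, not the actual change in this PR: the application signals readiness through a `threading.Event`, and the worker blocks on that event before it accepts tasks. All names (`run_startup`, `start_app`, `start_worker`) are hypothetical.

```python
import threading
import time


def run_startup(init_seconds: float = 0.05, timeout: float = 5.0) -> list[str]:
    """Simulate an app that initializes in the background and a worker that waits."""
    app_ready = threading.Event()
    log: list[str] = []

    def start_app() -> None:
        time.sleep(init_seconds)  # simulated application initialization
        log.append("app initialized")
        app_ready.set()           # signal that startup has completed

    def start_worker() -> bool:
        # Block until the app reports readiness instead of racing with it.
        if not app_ready.wait(timeout=timeout):
            log.append("startup timed out")
            return False
        log.append("worker accepting tasks")
        return True

    app_thread = threading.Thread(target=start_app)
    app_thread.start()
    start_worker()
    app_thread.join()
    return log
```

Without the wait, the worker could pick up a task while application state is still being set up, which is one plausible source of intermittent failures.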

Related issue/s

How to test

Dev-ops checklist

@sanderegg sanderegg self-assigned this Mar 17, 2025
@sanderegg sanderegg added the a:storage issue related to storage service label Mar 17, 2025
@sanderegg sanderegg added this to the The Awakening milestone Mar 17, 2025
codecov bot commented Mar 17, 2025

Codecov Report

Attention: Patch coverage is 82.91139% with 27 lines in your changes missing coverage. Please review.

Project coverage is 88.03%. Comparing base (5bbcc74) to head (e39a883).
Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7375      +/-   ##
==========================================
+ Coverage   87.43%   88.03%   +0.60%     
==========================================
  Files        1719     1337     -382     
  Lines       66692    55309   -11383     
  Branches     1132      586     -546     
==========================================
- Hits        58310    48692    -9618     
+ Misses       8061     6431    -1630     
+ Partials      321      186     -135     
| Flag | Coverage Δ |
|---|---|
| integrationtests | 65.22% <43.75%> (+0.06%) ⬆️ |
| unittests | 86.21% <82.91%> (-0.40%) ⬇️ |

| Components | Coverage Δ |
|---|---|
| api | ∅ <ø> (∅) |
| pkg_aws_library | ∅ <ø> (∅) |
| pkg_dask_task_models_library | ∅ <ø> (∅) |
| pkg_models_library | ∅ <ø> (∅) |
| pkg_notifications_library | ∅ <ø> (∅) |
| pkg_postgres_database | ∅ <ø> (∅) |
| pkg_service_integration | ∅ <ø> (∅) |
| pkg_service_library | 72.64% <0.00%> (-0.23%) ⬇️ |
| pkg_settings_library | ∅ <ø> (∅) |
| pkg_simcore_sdk | 85.46% <ø> (ø) |
| agent | 96.46% <ø> (ø) |
| api_server | 90.76% <ø> (ø) |
| autoscaling | 96.08% <ø> (ø) |
| catalog | 91.82% <ø> (ø) |
| clusters_keeper | 99.24% <ø> (ø) |
| dask_sidecar | 91.25% <ø> (ø) |
| datcore_adapter | 98.11% <ø> (ø) |
| director | 76.70% <ø> (+0.09%) ⬆️ |
| director_v2 | 91.28% <ø> (-0.02%) ⬇️ |
| dynamic_scheduler | 97.33% <ø> (ø) |
| dynamic_sidecar | 90.12% <ø> (+0.01%) ⬆️ |
| efs_guardian | 89.79% <ø> (ø) |
| invitations | 93.28% <ø> (ø) |
| payments | 92.66% <ø> (ø) |
| resource_usage_tracker | 89.12% <ø> (-0.17%) ⬇️ |
| storage | 86.61% <97.56%> (+0.17%) ⬆️ |
| webclient | ∅ <ø> (∅) |
| webserver | 88.21% <68.75%> (+2.37%) ⬆️ |
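
As a sanity check on the figures in this report, the short sketch below recomputes the project coverage from the hits, misses, and partials in the coverage diff. Two things are assumptions rather than facts from the report: that percentages are truncated (not rounded) to two decimals, which happens to reproduce the numbers shown, and that the patch size of 158 lines can be inferred from "27 lines missing" at 82.91139%.

```python
import math


def coverage_pct(hits: int, misses: int, partials: int) -> float:
    """Coverage as hits / total lines, truncated to two decimals (assumed)."""
    lines = hits + misses + partials
    return math.floor(hits / lines * 10000) / 100


# Values taken from the coverage diff: base is master, head is this PR.
base = coverage_pct(hits=58310, misses=8061, partials=321)
head = coverage_pct(hits=48692, misses=6431, partials=186)
delta = round(head - base, 2)

# Inferred, not stated in the report: total patch lines, given that
# 27 lines are uncovered and patch coverage is 82.91139%.
patch_lines = round(27 / (1 - 82.91139 / 100))
patch_covered = patch_lines - 27

print(base, head, delta)           # 87.43 88.03 0.6
print(patch_lines, patch_covered)  # 158 131
```

The line totals are also consistent with the table: hits, misses, and partials add up to 66692 lines for the base and 55309 for the head.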

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
@sanderegg sanderegg force-pushed the storage/complete-upload-worker-task branch 3 times, most recently from fa6d169 to 7a87169 Compare March 20, 2025 21:06
@sanderegg sanderegg changed the title 🎨Maintenance: added async task for completing upload of file 🎨Storage: HA Mar 20, 2025
@sanderegg sanderegg force-pushed the storage/complete-upload-worker-task branch from 7a87169 to 2977870 Compare March 20, 2025 21:29
@sanderegg sanderegg mentioned this pull request Mar 21, 2025
1 task
@sanderegg sanderegg force-pushed the storage/complete-upload-worker-task branch 5 times, most recently from 9cc4ae8 to 78adcad Compare March 25, 2025 11:08
@sanderegg sanderegg marked this pull request as ready for review March 25, 2025 12:24
@sanderegg sanderegg requested a review from bisgaard-itis March 25, 2025 12:24
@sanderegg sanderegg changed the title 🎨Storage: HA 🎨Storage: HA (🚨🚨🚨 test with multiple replicas) Mar 25, 2025
@sanderegg sanderegg requested a review from YuryHrytsuk March 25, 2025 12:25
@sanderegg sanderegg force-pushed the storage/complete-upload-worker-task branch from 016cdec to 709f4d0 Compare March 25, 2025 12:26
YuryHrytsuk added a commit to YuryHrytsuk/osparc-ops-environments that referenced this pull request Mar 25, 2025
Since storage can be replicated, we make number of its replicas
configurable via ENV

Related PR: ITISFoundation/osparc-simcore#7375
@YuryHrytsuk

Related to #5621

@YuryHrytsuk YuryHrytsuk left a comment

Thanks 🙏 Very much appreciated 🎉

Please, once this is deployed, merge changes in OPS (including related config changes). Or just let me know, I will do it 🚀

ITISFoundation/osparc-ops-environments#1000

@GitHK GitHK left a comment

Looking good. Have 1 question

@sanderegg sanderegg force-pushed the storage/complete-upload-worker-task branch from f501664 to 02d1fee Compare March 25, 2025 14:22
@sanderegg sanderegg force-pushed the storage/complete-upload-worker-task branch from 38d6642 to 3907cf9 Compare March 27, 2025 15:20
@sanderegg sanderegg merged commit c9296d7 into ITISFoundation:master Mar 27, 2025
92 of 94 checks passed
@sanderegg sanderegg deleted the storage/complete-upload-worker-task branch March 27, 2025 17:10
sanderegg pushed a commit to ITISFoundation/osparc-ops-environments that referenced this pull request Mar 27, 2025
Since storage can be replicated, we make number of its replicas
configurable via ENV

Related PR: ITISFoundation/osparc-simcore#7375
YuryHrytsuk added a commit to YuryHrytsuk/osparc-ops-environments that referenced this pull request Mar 28, 2025
YuryHrytsuk added a commit to ITISFoundation/osparc-ops-environments that referenced this pull request Mar 28, 2025
@matusdrobuliak66 matusdrobuliak66 mentioned this pull request Apr 15, 2025
56 tasks
Labels
a:storage issue related to storage service
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Storage: Handle multiple replicas of storage
6 participants