✨Storage: new paths entrypoint with pagination #7200

Merged
merged 28 commits into from
Feb 26, 2025

Conversation

sanderegg
Member

@sanderegg sanderegg commented Feb 10, 2025

Screencast.from.2025-02-25.17-13-48.mp4

Webserver API change
ReDoc

Storage API change
ReDoc

What do these changes do?

Reminder: storage only partially caches the files stored in S3 in the DB.

  1. Files from the file-picker and service inputs/outputs/logs are cached in the DB.
  2. Files inside a dynamic service state folder are not cached in the DB; only the base folder is. --> Any listing inside these folders therefore requires direct calls to the S3 backend.

Summary

This PR adds a new entrypoint in storage that lists files/folders (whether they are tracked in the DB or only present in S3) with pagination.

Webserver API

  • New API entrypoint: GET /storage/locations/{location_id}/paths with a file_filter query parameter
  • when file_filter is null, it lists the projects that have files,
  • when file_filter is a specific project ID, it lists the nodes with files inside that project,
  • when file_filter is PROJECTID/NODEID, it lists the files/folders in that node that have files in them,
  • this continues all the way down to the very last file,
  • this is compatible with both the simcore.s3 and datcore storages
  • moved storage tests to 01
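
To illustrate the drill-down described above, here is a minimal sketch of how a client might build the request URL and reason about what each file_filter value lists. Both helper names (`listing_level`, `build_paths_url`) are hypothetical and not part of this PR; the sketch also assumes that omitting the query parameter is equivalent to passing null, and it leaves out any host or API version prefix.

```python
from typing import Optional
from urllib.parse import urlencode


def listing_level(file_filter: Optional[str]) -> str:
    """Describe what a given file_filter lists (illustrative helper only)."""
    parts = [p for p in (file_filter or "").split("/") if p]
    if not parts:
        return "projects with files"
    if len(parts) == 1:
        return "nodes with files in the project"
    return "files/folders below the given path"


def build_paths_url(location_id: int, file_filter: Optional[str] = None) -> str:
    """Build the relative URL of the entrypoint (no host, no API prefix)."""
    path = f"/storage/locations/{location_id}/paths"
    if file_filter is None:
        # Assumption: a null filter is expressed by omitting the parameter.
        return path
    # urlencode percent-encodes the "/" separating PROJECTID and NODEID.
    return f"{path}?{urlencode({'file_filter': file_filter})}"
```

For example, `build_paths_url(0, "PROJECTID/NODEID")` produces a URL whose listing level is "files/folders below the given path"; the location ID `0` is only a placeholder here.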

Storage Rest API

  • implements the above with driving tests: test_handler_paths.py

Pagination

  • done via cursor-based pagination (as the total is not known in S3),
  • --> calls to GET /storage/locations/{location_id}/paths accept limit and cursor query parameters
  • --> the initial call shall either set cursor to null or omit it
  • --> if there are more files, the response body contains a next_page (the next cursor), which shall be passed as the cursor of the next call to GET /storage/locations/{location_id}/paths to get the next page
  • --> the total field is only filled when the call runs solely against the DB (so do not rely on it: it is empty when the call goes to S3)
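
The pagination contract above could be consumed roughly as follows. This is a sketch under assumptions: `next_page` and `total` are the field names mentioned in this description, but the `items` key, the `iter_paths` helper, and the in-memory `make_fake_backend` stand-in are hypothetical and only serve to make the loop runnable without a server.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Page = Dict[str, Any]
FetchPage = Callable[[Optional[str], int], Page]


def iter_paths(fetch_page: FetchPage, limit: int = 50) -> Iterator[Any]:
    """Yield every entry by following next_page cursors until none is left."""
    cursor: Optional[str] = None  # initial call: no cursor
    while True:
        page = fetch_page(cursor, limit)
        # Assumption: entries live under an "items" key in the response body.
        yield from page.get("items", [])
        cursor = page.get("next_page")
        if not cursor:
            break  # no next_page means this was the last page


def make_fake_backend(paths: List[str]) -> FetchPage:
    """In-memory stand-in for GET .../paths, using an offset as opaque cursor."""

    def fetch_page(cursor: Optional[str], limit: int) -> Page:
        start = int(cursor) if cursor else 0
        end = start + limit
        return {
            "items": paths[start:end],
            "next_page": str(end) if end < len(paths) else None,
            "total": None,  # mimics an S3-backed listing: total is unknown
        }

    return fetch_page
```

Note that the loop terminates on the absence of `next_page` and never consults `total`, in line with the advice above not to rely on it.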

AWS-library

  • added S3 functions to list a page of objects
  • added S3 functions to count objects
  • driving tests: test_s3_client.py
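
As a rough idea of what such helpers can look like, the sketch below wraps the S3 `list_objects_v2` call with its `ContinuationToken`. It is not the code added in this PR: the function names and signatures are hypothetical, and it is written against a synchronous boto3-style client passed in as `client` (with aioboto3 the call would be awaited instead).

```python
from typing import Any, Dict, List, Optional, Tuple


def list_objects_page(
    client: Any,
    bucket: str,
    prefix: str,
    limit: int = 500,
    next_cursor: Optional[str] = None,
) -> Tuple[List[str], Optional[str]]:
    """Return one page of direct children of prefix and the next cursor."""
    kwargs: Dict[str, Any] = {
        "Bucket": bucket,
        "Prefix": prefix,
        "Delimiter": "/",  # group deeper keys into folder-like prefixes
        "MaxKeys": limit,
    }
    if next_cursor:
        kwargs["ContinuationToken"] = next_cursor
    response = client.list_objects_v2(**kwargs)
    folders = [p["Prefix"] for p in response.get("CommonPrefixes", [])]
    files = [o["Key"] for o in response.get("Contents", [])]
    cursor = (
        response.get("NextContinuationToken")
        if response.get("IsTruncated")
        else None
    )
    return folders + files, cursor


def count_objects(client: Any, bucket: str, prefix: str) -> int:
    """Count all objects under prefix by walking every page."""
    total = 0
    cursor: Optional[str] = None
    while True:
        kwargs: Dict[str, Any] = {"Bucket": bucket, "Prefix": prefix}
        if cursor:
            kwargs["ContinuationToken"] = cursor
        response = client.list_objects_v2(**kwargs)
        total += len(response.get("Contents", []))
        if not response.get("IsTruncated"):
            return total
        cursor = response["NextContinuationToken"]
```

Counting requires walking every page, which illustrates why a total is not readily available for listings served from S3.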

Requirements

  • unification of types-aioboto3 as there were missing functions

Next steps

  • @odeimaiz to implement the frontend
  • @sanderegg to remove the old entrypoints and continue with cleanup/fixing

Related issue/s

How to test

Dev-ops checklist

@sanderegg sanderegg added the a:storage (issue related to storage service) and a:webserver (issue related to the webserver service) labels Feb 10, 2025
@sanderegg sanderegg added this to the Singularity milestone Feb 10, 2025
@sanderegg sanderegg self-assigned this Feb 10, 2025
codecov bot commented Feb 10, 2025

Codecov Report

Attention: Patch coverage is 71.22153% with 139 lines in your changes missing coverage. Please review.

Project coverage is 87.10%. Comparing base (7dd1e0a) to head (d2b1df0).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7200      +/-   ##
==========================================
+ Coverage   85.54%   87.10%   +1.56%     
==========================================
  Files        1680     1685       +5     
  Lines       65057    65437     +380     
  Branches     1106     1115       +9     
==========================================
+ Hits        55650    57000    +1350     
+ Misses       9093     8122     -971     
- Partials      314      315       +1     
Flag Coverage Δ
integrationtests 65.41% <33.33%> (-0.07%) ⬇️
unittests 86.12% <71.22%> (+1.81%) ⬆️
Components Coverage Δ
api 76.84% <ø> (ø)
pkg_aws_library 94.24% <96.55%> (+0.06%) ⬆️
pkg_dask_task_models_library 97.09% <ø> (ø)
pkg_models_library 91.67% <100.00%> (+0.05%) ⬆️
pkg_notifications_library 84.57% <ø> (ø)
pkg_postgres_database 88.28% <ø> (ø)
pkg_service_integration 70.03% <ø> (ø)
pkg_service_library 72.57% <ø> (ø)
pkg_settings_library 90.61% <ø> (ø)
pkg_simcore_sdk 85.46% <ø> (ø)
agent 96.46% <ø> (ø)
api_server 90.56% <ø> (ø)
autoscaling 96.08% <ø> (ø)
catalog 91.73% <ø> (ø)
clusters_keeper 99.24% <ø> (ø)
dask_sidecar 91.25% <ø> (ø)
datcore_adapter 98.06% <ø> (ø)
director 76.59% <ø> (ø)
director_v2 91.27% <ø> (-0.05%) ⬇️
dynamic_scheduler 97.33% <ø> (ø)
dynamic_sidecar 89.74% <ø> (ø)
efs_guardian 90.25% <ø> (ø)
invitations 93.28% <ø> (ø)
osparc_gateway_server ∅ <ø> (∅)
payments 92.66% <ø> (ø)
resource_usage_tracker 88.97% <ø> (-0.11%) ⬇️
storage 84.15% <63.68%> (-2.56%) ⬇️
webclient ∅ <ø> (∅)
webserver 85.17% <83.33%> (+5.47%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@sanderegg sanderegg force-pushed the storage/add-pagination branch 8 times, most recently from 5672008 to ccaff88 Compare February 17, 2025 07:37
@sanderegg sanderegg force-pushed the storage/add-pagination branch from 7ab8d54 to 02ebf8d Compare February 20, 2025 15:47
@sanderegg sanderegg modified the milestones: Singularity, The Awakening Feb 24, 2025
@sanderegg sanderegg force-pushed the storage/add-pagination branch 2 times, most recently from 9bde281 to 8411473 Compare February 25, 2025 11:30
@sanderegg sanderegg marked this pull request as ready for review February 25, 2025 13:48
@sanderegg sanderegg changed the title from "✨Storage: new entrypoints with pagination" to "✨Storage: new paths entrypoint with pagination" Feb 25, 2025
Member

@odeimaiz odeimaiz left a comment

Nice, thanks a lot 👌

@sanderegg sanderegg force-pushed the storage/add-pagination branch from a74a198 to aad652b Compare February 25, 2025 15:30
Member

@pcrespov pcrespov left a comment

nice! thx.

Collaborator

@matusdrobuliak66 matusdrobuliak66 left a comment

Thanks for explanation 👍

@sanderegg sanderegg merged commit 387826f into ITISFoundation:master Feb 26, 2025
90 of 95 checks passed
@sanderegg sanderegg deleted the storage/add-pagination branch February 26, 2025 07:21
@matusdrobuliak66 matusdrobuliak66 mentioned this pull request Mar 6, 2025
63 tasks
mrnicegyu11 pushed a commit to mrnicegyu11/osparc-simcore that referenced this pull request Mar 26, 2025
Labels
a:storage (issue related to storage service), a:webserver (issue related to the webserver service)
4 participants