Skip to content

🐛 Fixes catalog's synchronization background task continues errors due to faulty service info #6344

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 11, 2024

Conversation

pcrespov
Copy link
Member

@pcrespov pcrespov commented Sep 10, 2024

What do these changes do?

Prevents the background sync task in catalog from continuously "stopping and restarting" due to errors in the director's response for certain items in the list. This change ensures that the task can still complete for the remaining items despite errors.

This is the error recurrently logged by the background sync task:

log_level=DEBUG | log_timestamp=2024-09-10 13:40:01,893 | log_source=httpcore.http11:atrace(85) | log_uid=None | log_msg=response_closed.complete
log_level=ERROR | log_timestamp=2024-09-10 13:40:01,893 | log_source=simcore_service_catalog.core.background_tasks:_sync_services_task(190) | log_uid=None | log_msg=Unexpected error while syncing registry entries, restarting now...
Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.11/site-packages/simcore_service_catalog/core/background_tasks.py", line 177, in _sync_services_task
    await _run_sync_services(app)
  File "/home/scu/.venv/lib/python3.11/site-packages/simcore_service_catalog/core/background_tasks.py", line 165, in _run_sync_services
    await _ensure_registry_and_database_are_synced(app)
  File "/home/scu/.venv/lib/python3.11/site-packages/simcore_service_catalog/core/background_tasks.py", line 114, in _ensure_registry_and_database_are_synced
    await _create_services_in_database(
  File "/home/scu/.venv/lib/python3.11/site-packages/simcore_service_catalog/core/background_tasks.py", line 74, in _create_services_in_database
    ) = await access_rights.evaluate_default_policy(app, service_metadata)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scu/.venv/lib/python3.11/site-packages/simcore_service_catalog/services/access_rights.py", line 73, in evaluate_default_policy
    if _is_frontend_service(service) or await _is_old_service(app, service):
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scu/.venv/lib/python3.11/site-packages/simcore_service_catalog/services/access_rights.py", line 43, in _is_old_service
    await client.get(
  File "/home/scu/.venv/lib/python3.11/site-packages/simcore_service_catalog/services/director.py", line 94, in request_wrapper
    return _unenvelope_or_raise_error(resp)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scu/.venv/lib/python3.11/site-packages/simcore_service_catalog/services/director.py", line 71, in _unenvelope_or_raise_error
    raise HTTPException(resp.status_code, detail=msg)
fastapi.exceptions.HTTPException: 404

We have improved the resilience of these routines by catching expected exceptions, issuing a warning, and continuing to process the remaining elements in the loop without interruption.

Misc

  • extra tmp timeout for @mguidon until issue is fully resolved

Related issue/s

How to test

cd services/catalog
make install-dev
# when parameter `director_fails=True`
pytest -vv tests/unit/with_dbs -k test_registry_sync_task

Dev-ops checklist

@pcrespov pcrespov self-assigned this Sep 10, 2024
@pcrespov pcrespov added the a:catalog catalog service label Sep 10, 2024
@pcrespov pcrespov added this to the Eisbock milestone Sep 10, 2024
Copy link

codecov bot commented Sep 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.3%. Comparing base (cafbf96) to head (a10025e).
Report is 529 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #6344      +/-   ##
=========================================
- Coverage    84.5%   82.3%    -2.3%     
=========================================
  Files          10    1103    +1093     
  Lines         214   48647   +48433     
  Branches       25     925     +900     
=========================================
+ Hits          181   40068   +39887     
- Misses         23    8408    +8385     
- Partials       10     171     +161     
Flag Coverage Δ
integrationtests 64.3% <ø> (?)
unittests 79.9% <100.0%> (-4.7%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...icelib/rabbitmq/rpc_interfaces/catalog/services.py 0.0% <ø> (ø)
...c/simcore_service_catalog/core/background_tasks.py 92.3% <100.0%> (ø)
...g/src/simcore_service_catalog/exceptions/errors.py 93.3% <100.0%> (ø)
.../simcore_service_catalog/services/access_rights.py 98.7% <100.0%> (ø)

... and 1109 files with indirect coverage changes

@pcrespov pcrespov changed the title Is6318/fix catalog 🐛 Fixes catalog's synchronization background task continues errors due to faulty service info Sep 10, 2024
@pcrespov pcrespov marked this pull request as ready for review September 10, 2024 16:36
@pcrespov pcrespov enabled auto-merge (squash) September 10, 2024 16:37
Copy link
Collaborator

@matusdrobuliak66 matusdrobuliak66 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks.

Copy link
Member

@sanderegg sanderegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect, thanks!

Copy link
Contributor

@GitHK GitHK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@pcrespov pcrespov disabled auto-merge September 11, 2024 15:22
@pcrespov pcrespov merged commit 9380819 into ITISFoundation:master Sep 11, 2024
2 checks passed
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:catalog catalog service
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Director stops synchronizing with registry if there is a problem
4 participants