check service overload at health entrypoint #1401

Closed

Conversation

pcrespov
Member

@pcrespov pcrespov commented Mar 23, 2020

What do these changes do?

Introduces minimal diagnostics that invalidate the healthcheck entrypoint so that the webserver is restarted automatically.

  • servicelib.monitor_slow_callbacks: hooks into the event-loop handler and registers an incident whenever there is a slow callback
  • webserver: a new diagnostics module that keeps track of incidents
  • webserver healthcheck now returns 503 when the service is overloaded due to long-delayed callbacks. Swarm will then restart the webserver.
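
The mechanism in the list above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this PR: `IncidentsRegistry`, `enable_slow_callback_monitor`, `health_status`, and both thresholds are hypothetical names and values chosen for the example. It also assumes that wrapping asyncio's private `Handle._run` is acceptable, which ties it to CPython's default event loop.

```python
import asyncio
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class IncidentsRegistry:
    """Hypothetical store of slow-callback incidents (durations in seconds)."""

    max_incidents: int = 3  # assumed threshold, not taken from the PR
    delays: List[float] = field(default_factory=list)

    def register(self, delay_secs: float) -> None:
        self.delays.append(delay_secs)

    def is_overloaded(self) -> bool:
        return len(self.delays) > self.max_incidents


def enable_slow_callback_monitor(
    registry: IncidentsRegistry, slow_duration_secs: float = 0.1
) -> Callable[[], None]:
    """Time every event-loop callback and register the slow ones.

    Patches the private ``asyncio.events.Handle._run`` (an assumption of
    this sketch). Returns a function that restores the original method.
    """
    original_run = asyncio.events.Handle._run

    def timed_run(self):
        start = time.perf_counter()
        try:
            return original_run(self)
        finally:
            elapsed = time.perf_counter() - start
            if elapsed >= slow_duration_secs:
                registry.register(elapsed)

    asyncio.events.Handle._run = timed_run

    def disable() -> None:
        asyncio.events.Handle._run = original_run

    return disable


def health_status(registry: IncidentsRegistry) -> int:
    """HTTP status a health entrypoint could return: 503 when overloaded."""
    return 503 if registry.is_overloaded() else 200
```

In a real service the health handler would return this status in its HTTP response, and the orchestrator's health check would treat a non-2xx answer as unhealthy.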

This got complicated because of the following upgrades in the webserver service:

  • requirements
    • Fixes due to changes in trafaret 2.0.2
  • Dockerfile (from alpine to debian)
    • docker/*sh scripts
    • fix: pip in editable mode MUST NOT produce root folders in mounted volumes

How to test

cd packages/service-library
make install-dev
make tests

cd ../../services/web/server
make install-dev
make tests-unit

Checklist

  • Did you change any service's API? Then make sure to bundle document and upgrade version (make openapi-specs, git commit ... and then make version-*)
  • Unit tests for the changes exist
  • Runs in the swarm
  • Documentation reflects the changes
  • New module? Add your github username to .github/CODEOWNERS

@pcrespov pcrespov self-assigned this Mar 23, 2020
@codecov

codecov bot commented Mar 23, 2020

Codecov Report

Merging #1401 into master will increase coverage by 0.22%.
The diff coverage is 93.2%.

@@            Coverage Diff             @@
##           master    #1401      +/-   ##
==========================================
+ Coverage   70.52%   70.75%   +0.22%     
==========================================
  Files         222      225       +3     
  Lines        8818     8914      +96     
  Branches      968      979      +11     
==========================================
+ Hits         6219     6307      +88     
- Misses       2322     2324       +2     
- Partials      277      283       +6

Flag                Coverage Δ
#integrationtests   57.19% <52.08%> (-0.05%) ⬇️
#unittests          65.32% <93.2%> (+0.28%) ⬆️

Impacted Files                                              Coverage Δ
packages/simcore-sdk/src/simcore_sdk/config/s3.py           0% <ø> (ø) ⬆️
.../server/src/simcore_service_webserver/db_config.py       100% <ø> (ø) ⬆️
...r/src/simcore_service_webserver/socketio/config.py       92.3% <ø> (ø) ⬆️
...r/src/simcore_service_webserver/director/config.py       100% <ø> (ø) ⬆️
packages/service-library/src/servicelib/tracing.py          0% <ø> (ø) ⬆️
...rver/src/simcore_service_webserver/login/config.py       100% <ø> (ø) ⬆️
...rver/src/simcore_service_webserver/email_config.py       100% <ø> (ø) ⬆️
...kages/simcore-sdk/src/simcore_sdk/config/rabbit.py       0% <ø> (ø) ⬆️
...er/src/simcore_service_webserver/catalog_config.py       91.66% <ø> (ø) ⬆️
...r/src/simcore_service_webserver/activity/config.py       100% <ø> (ø) ⬆️
... and 16 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c29beaf...a79ff48. Read the comment docs.

@pcrespov pcrespov changed the title Enh/check overload health WIP: check overload health Mar 23, 2020
@pcrespov pcrespov changed the title WIP: check overload health WIP: check service overload at health entrypoint Mar 23, 2020
@pcrespov pcrespov force-pushed the enh/check-overload-health branch from 87b34c1 to 6b0bb83 Compare March 23, 2020 18:40
@pcrespov pcrespov marked this pull request as ready for review March 24, 2020 15:28
@pcrespov pcrespov changed the title WIP: check service overload at health entrypoint check service overload at health entrypoint Mar 24, 2020
@pcrespov pcrespov added a:services-library issues on packages/service-libs a:webserver issue related to the webserver service dependencies t:enhancement Improvement or request on an existing feature labels Mar 24, 2020
@pcrespov pcrespov added this to the Dim Sum milestone Mar 24, 2020
@pcrespov pcrespov force-pushed the enh/check-overload-health branch from 2f90fe2 to 89a1d43 Compare March 24, 2020 23:02
@pcrespov
Member Author

Split into PR #1406, plus another PR with the Dockerfile and requirements upgrades coming later.

@pcrespov pcrespov closed this Mar 25, 2020
@pcrespov pcrespov mentioned this pull request Mar 26, 2020
5 tasks
@pcrespov pcrespov deleted the enh/check-overload-health branch August 17, 2020 07:20
Labels
a:services-library issues on packages/service-libs a:webserver issue related to the webserver service t:enhancement Improvement or request on an existing feature