Is1304/add internal traefik instance for reverse proxy #1313

@sanderegg sanderegg commented Feb 25, 2020

What do these changes do?

Adds an internal Traefik instance to the simcore stack to replace the webserver's failing reverse proxy. All calls to /x/{node_uuid} are now routed through Traefik to the corresponding dynamic service.
This has the following consequences:
traefik:

  • every service whose traffic must be reverse-proxied by the internal traefik instance must be labelled with io.simcore.zone: ${TRAEFIK_SIMCORE_ZONE}, or it will be picked up by the external Traefik instance.
  • the traefik dashboard is available when running the stack in devel mode (traefik does not handle 2 dashboards at the moment). In devel mode, the traefik UI is accessible through http://127.0.0.1:8080

director:

  • the director now sets the dedicated traefik labels on the spawned services so they are available through the reverse proxy at /x/{node_uuid} as before
  • depending on the service's reverse-proxy settings, the director also sets the traefik strip_regex middleware on the service when strip_path:true is set on it (currently used for the 3d-viewer).
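
The labelling described above can be sketched as follows. This is a minimal illustration, not the director's actual code: the function name `traefik_labels`, the `svc-` naming scheme and the port argument are hypothetical, the priority of 10 is taken from the discussion further down, and the label keys follow the Traefik v2 Docker label conventions.

```python
from typing import Dict


def traefik_labels(
    node_uuid: str, port: int, zone: str, strip_path: bool = False
) -> Dict[str, str]:
    """Build Traefik v2 labels exposing a spawned service at /x/{node_uuid}.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    name = f"svc-{node_uuid}"  # hypothetical router/service name
    prefix = f"/x/{node_uuid}"
    labels = {
        # zone label so that only the internal traefik instance picks it up
        "io.simcore.zone": zone,
        "traefik.enable": "true",
        f"traefik.http.services.{name}.loadbalancer.server.port": str(port),
        f"traefik.http.routers.{name}.rule": f"PathPrefix(`{prefix}`)",
        # explicit priority so the /x/... rule is tested before catch-all rules
        f"traefik.http.routers.{name}.priority": "10",
    }
    if strip_path:
        # strip the /x/{node_uuid} prefix before forwarding to the service
        middleware = f"{name}-strip"
        labels[
            f"traefik.http.middlewares.{middleware}.stripprefixregex.regex"
        ] = f"^{prefix}"
        labels[f"traefik.http.routers.{name}.middlewares"] = middleware
    return labels
```

A service that needs the prefix removed (as the 3d-viewer reportedly does) would be labelled with `strip_path=True`; other services keep the full /x/{node_uuid} path.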

frontend:

  • since the internal traefik instance now takes care of the reverse proxy, the frontend needs to poll the service until the reverse proxy is set up.
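
The polling idea can be sketched like this. The real frontend is not written in Python; this only illustrates the retry logic, and the function name, the attempt count and the delay are assumptions.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 10,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it reports success or the attempts run out.

    `probe` stands for "request /x/{node_uuid} and check the response";
    it is injected so the loop can be exercised without a network.
    """
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # wait before asking again
    return False
```

Injecting `probe` and `sleep` keeps the loop independent of any HTTP client, so the caller decides what "ready" means (for example, any non-404 response).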

additional configurations:

  • TRAEFIK_SIMCORE_ZONE: defines the name of the internal traefik zone (it acts as a constraint that traefik uses to select the services that need the reverse proxy)
  • DIRECTOR_SELF_SIGNED_SSL_SECRET_ID, DIRECTOR_SELF_SIGNED_SSL_SECRET_NAME, DIRECTOR_SELF_SIGNED_SSL_FILENAME are used when deploying simcore in an osparc-ops environment where self-signed certificates are used. These certificates are passed down to the spawned services.
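
How the zone constraint separates services can be illustrated with a small sketch. It only models the filtering described above; the function name and the example service names are hypothetical, and traefik itself evaluates the constraint internally.

```python
from typing import Dict, List, Tuple

ZONE_LABEL = "io.simcore.zone"


def split_by_zone(
    services: Dict[str, Dict[str, str]], zone: str
) -> Tuple[List[str], List[str]]:
    """Separate services handled by the internal instance from all others.

    `services` maps a service name to its labels. A service belongs to the
    internal instance only if its zone label equals the configured zone.
    """
    internal, other = [], []
    for name, labels in services.items():
        (internal if labels.get(ZONE_LABEL) == zone else other).append(name)
    return internal, other
```

For example, `split_by_zone({"viewer": {"io.simcore.zone": "internal"}, "portal": {}}, "internal")` puts `viewer` in the first list and `portal` in the second.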

IMPORTANT: the 3D-viewer 3.x must be used from now on. version 2.x will fail.
IMPORTANT: a corresponding PR in osparc-ops shall be used to make the clusters compatible

Bonus:

  • adds Hadolint in launch.json for linting Dockerfiles
  • adds director to the list of remote python debugged services in launch.json
  • verbosifies some scripts

Related issue number

closes #1304

How to test

make build
make up
# then open http://localhost:9081

Checklist

  • Did you change any service's API? Then make sure to bundle document and upgrade version (make openapi-specs, git commit ... and then make version-*)
  • Unit tests for the changes exist
  • Runs in the swarm
  • Documentation reflects the changes
  • New module? Add your github username to .github/CODEOWNERS

@sanderegg sanderegg added this to the Mithos milestone Feb 25, 2020
@sanderegg sanderegg self-assigned this Feb 25, 2020

codecov bot commented Feb 25, 2020

Codecov Report

Merging #1313 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1313   +/-   ##
=======================================
  Coverage   60.66%   60.66%           
=======================================
  Files         241      241           
  Lines        9574     9574           
  Branches     1056     1056           
=======================================
  Hits         5808     5808           
  Misses       3489     3489           
  Partials      277      277           
Flag Coverage Δ
#integrationtests 58.85% <0.00%> (ø)
#unittests 54.92% <0.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@sanderegg sanderegg changed the title Is1304/add internal traefik instance for reverse proxy WIP: Is1304/add internal traefik instance for reverse proxy Feb 26, 2020
@sanderegg sanderegg force-pushed the is1304/add_internal_traefik_instance_for_reverse_proxy branch from 6e8b156 to 12da15f Compare March 18, 2020 17:01
@pcrespov pcrespov removed their request for review March 31, 2020 08:54
@sanderegg sanderegg force-pushed the is1304/add_internal_traefik_instance_for_reverse_proxy branch 3 times, most recently from d375d30 to 01e9ded Compare April 7, 2020 14:28
@sanderegg sanderegg force-pushed the is1304/add_internal_traefik_instance_for_reverse_proxy branch 2 times, most recently from 5191df6 to d026505 Compare April 8, 2020 07:16

@odeimaiz odeimaiz left a comment

Frontend part looks good to me 👍

@sanderegg sanderegg modified the milestones: Dim Sum, Zhong Zi Apr 16, 2020

@pcrespov pcrespov left a comment

So, if I understand it well, in devel mode:

  • All calls to http://127.0.0.1:8080 are routed into the traefik dashboard
  • All calls to http://127.0.0.1:8082 expose the Prometheus metrics of traefik
  • All http traffic to 9081 is forwarded either to the webserver or to any of the backend services spawned by the director with the appropriate labels.

I have the following questions:

  1. http://172.0.0.1:8080/ or http://172.0.0.1:8080/dashboard or http://172.0.0.1:8080/api returns 404 in devel-mode
  2. are the rules for the webserver (hostregexp(`{host:.+}`)) and for the services spawned by the director (PathPrefix(`/x/{node_Uuid}`)) overlapping? e.g. does http://172.0.0.1:9081/x/123 fulfill both rules? how is this resolved?

@sanderegg sanderegg requested a review from pcrespov April 16, 2020 16:57
sanderegg commented Apr 16, 2020

So in general you got that right:

  1. http://172.0.0.1:8080/ or http://172.0.0.1:8080/dashboard or http://172.0.0.1:8080/api returns 404 in devel-mode

You need to add a trailing / after dashboard, i.e. use /dashboard/

  2. are the rules for the webserver (hostregexp(`{host:.+}`)) and for the services spawned by the director (PathPrefix(`/x/{node_Uuid}`)) overlapping? e.g. does http://172.0.0.1:9081/x/123 fulfill both rules? how is this resolved?

Yes, that is correct: both rules match.
By default Traefik applies a priority rule that is a bit hard to grasp, see https://docs.traefik.io/routing/routers/#priority
The alternative, which I find clearer, is to set an explicit priority on the router, 1 being the lowest priority. Note that the director sets a priority of 10 to ensure the /x/uuid rule is tested first.
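
The resolution can be sketched as follows. This is a simplified model, not Traefik code: per the linked documentation, a router without an explicit priority falls back to the length of its rule, and the matching router with the highest priority wins. The webserver priority of 1 used in the usage note is an assumption; only the director's value of 10 is stated above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Router:
    name: str
    rule: str                       # rule text, used for the default priority
    matches: Callable[[str], bool]  # simplified stand-in for rule evaluation
    priority: int = 0               # 0 means "use the default"

    def effective_priority(self) -> int:
        # default priority is the rule length when none is set explicitly
        return self.priority if self.priority > 0 else len(self.rule)


def select_router(routers: List[Router], path: str) -> Optional[Router]:
    """Return the matching router with the highest effective priority."""
    candidates = [r for r in routers if r.matches(path)]
    return max(candidates, key=Router.effective_priority, default=None)
```

With a catch-all `Router("webserver", "HostRegexp(`{host:.+}`)", lambda p: True, priority=1)` and a `Router("service", "PathPrefix(`/x/123`)", lambda p: p.startswith("/x/123"), priority=10)`, a request to /x/123 matches both but is routed to `service`, while any other path goes to `webserver`.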

Anyway I discovered some things while playing with the osparc-ops that I will try to add in here to make things clearer (especially with the dashboard and api). The problem is that the dashboard has issues with multiple traefik instances but I'll try to have a way to see both of them.


@pcrespov pcrespov left a comment

Traefik logs in devel mode show some warnings and one error:

time="2020-04-16T18:00:51Z" level=info msg="Configuration loaded from flags.",
time="2020-04-16T18:00:51Z" level=warning msg="Could not initialize jaeger tracer: lookup jaeger on 127.0.0.11:53: no such host",
time="2020-04-16T18:00:51Z" level=warning msg="Unable to create tracer: lookup jaeger on 127.0.0.11:53: no such host",
time="2020-04-16T18:01:06Z" level=error msg="middleware \"gzip@docker\" does not exist" routerName=api_internal@docker entryPointName=traefik_dashboard,

@pcrespov commented

~/devp/osparc-simcore$ wget http://127.0.0.1:8080/dashboard/
--2020-04-16 20:04:59--  http://127.0.0.1:8080/dashboard/
Connecting to 127.0.0.1:8080... connected.
HTTP request sent, awaiting response... 404 Not Found
2020-04-16 20:04:59 ERROR 404: Not Found.

@sanderegg commented

will check it out.

changed default port to 80
set published port of traefik to 9081
set published port of webserver to random
@sanderegg commented

so my guess is that you were actually using make up-prod and not make up-devel which is what I meant by devel mode.

anyway this is fixed now by moving these labels in docker-compose.local.yml so it will now work in both cases.

so now you will find the following:

@sanderegg sanderegg requested a review from pcrespov April 16, 2020 19:44

@pcrespov pcrespov left a comment

Excellent. Now it works fine !

@@ -40,7 +40,7 @@ services:
     environment:
       - SC_BOOT_MODE=${SC_BOOT_MODE:-default}
     ports:
-      - ${SIMCORE_PORT:-9081}:8080
+      - "8080"

much better

@@ -88,7 +104,7 @@ services:
     init: true
     deploy:
       mode: replicated
-      replicas: 4
+      replicas: 8

Shouldn't we also have replicas of traefik?

@sanderegg sanderegg merged commit 6335ad3 into ITISFoundation:master Apr 17, 2020
@sanderegg sanderegg deleted the is1304/add_internal_traefik_instance_for_reverse_proxy branch April 17, 2020 09:20
@sanderegg sanderegg mentioned this pull request Apr 27, 2020
Successfully merging this pull request may close these issues.

remove reverse proxy from webserver and use another instance of traefik