krb5kdc-fixture intermittently fails to start #40624

Closed
alpar-t opened this issue Mar 29, 2019 · 4 comments · Fixed by #41710
Labels
:Delivery/Build Build or test infrastructure :Security/Security Security issues without another label Team:Delivery Meta label for Delivery team >test-failure Triaged test failures from CI v7.2.0 v8.0.0-alpha1

Comments

@alpar-t
Contributor

alpar-t commented Mar 29, 2019

Ran into this with a PR run: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request-1/11035/console

Waiting for TCP socket on 172.18.0.1:32769 of service 'hdfs_1' (Connection refused (Connection refused))
Waiting for TCP socket on 172.18.0.1:32769 of service 'hdfs_1' (Connection refused (Connection refused))
Starting process 'command 'docker''. Working directory: /var/lib/jenkins/workspace/elastic+elasticsearch+pull-request-1/test/fixtures/krb5kdc-fixture Command: docker logs --follow=false 65e4f2dd95101e51f0ddba9a40d7448b17ce2ecb16659cccc7e88b6120c4265b
Successfully started process 'command 'docker''
Starting process 'command '/usr/local/bin/docker-compose''. Working directory: /var/lib/jenkins/workspace/elastic+elasticsearch+pull-request-1/test/fixtures/krb5kdc-fixture Command: /usr/local/bin/docker-compose --no-ansi -f docker-compose.yml -p 78d469b805523ccd047df2ab0eb6ba41_krb5kdc-fixture_ stop --timeout 10
kadmin: GSS-API (or Kerberos) error while initializing kadmin interface

I found this Server Fault question: https://serverfault.com/questions/803662/kerberos-error-while-initializing-kadmin-interface-from-admin-server

Since we use ephemeral workers, the explanation seems plausible for us: newly booted VMs have little entropy, which can lead to this failure.

We should probably install rng-tools on the CI workers to see if it helps.
The problem is not reproducible locally, as one would expect, since developers' kernels will have plenty of entropy.
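
To compare a CI worker with a developer machine, the kernel's entropy estimate can be read from procfs. The following is only a minimal sketch, assuming a Linux host; the helper names and the 1000-bit threshold are hypothetical and chosen for illustration, and newer kernels may report a small fixed value here, which makes the number less informative.

```python
from pathlib import Path

# Linux exposes the kernel's entropy estimate (in bits) through procfs.
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")


def read_entropy_bits(path: Path = ENTROPY_AVAIL) -> int:
    """Return the kernel's current entropy estimate in bits."""
    return int(path.read_text().strip())


def is_entropy_low(bits: int, threshold: int = 1000) -> bool:
    """Return True if the estimate is below the threshold.

    The default threshold is an arbitrary, illustrative value,
    not a number taken from the kernel or from this issue.
    """
    return bits < threshold
```

Calling `is_entropy_low(read_entropy_bits())` on a freshly booted worker just before the fixture starts would show whether the low-entropy explanation holds there.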

@alpar-t alpar-t added :Delivery/Build Build or test infrastructure :Security/Security Security issues without another label labels Mar 29, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@elasticmachine
Collaborator

Pinging @elastic/es-security

@bizybot
Contributor

bizybot commented Apr 26, 2019

Hi @atorok,
Since infra has fixed the CI workers, maybe we can now just mount /dev/urandom into the container, which should help our fixture? I had the same problem in another Kerberos PR, and I think it was fixed after mounting:

- /dev/urandom:/dev/random

I can pick this up if that's okay with you. Thanks.
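
One way to confirm that such a mount took effect is to check, from inside the container, which device `/dev/random` actually resolves to. This is a rough sketch, assuming a Linux container with Python available; the helper names are hypothetical. It relies on the standard Linux device numbers, where `/dev/random` is character device 1:8 and `/dev/urandom` is 1:9.

```python
import os
import stat
from typing import Tuple

# Standard Linux device numbers for the memory character devices (major 1).
RANDOM_DEV = (1, 8)   # /dev/random
URANDOM_DEV = (1, 9)  # /dev/urandom


def device_numbers(path: str) -> Tuple[int, int]:
    """Return (major, minor) for the character device node at `path`."""
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        raise ValueError(f"{path} is not a character device")
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def is_backed_by_urandom(path: str = "/dev/random") -> bool:
    """Return True if `path` is really the urandom device.

    After a bind mount of /dev/urandom over /dev/random, stat on
    /dev/random reports urandom's device numbers.
    """
    return device_numbers(path) == URANDOM_DEV
```

With the mount in place, `is_backed_by_urandom()` should return `True` inside the container; without it, `/dev/random` is expected to report 1:8 instead.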

bizybot pushed a commit to bizybot/elasticsearch that referenced this issue May 1, 2019
Infra has fixed elastic#10462 by installing `haveged` on CI workers.
This commit enables the disabled fixture and tests, and mounts
`/dev/urandom` for the container so there is enough
entropy required for kdc.

Closes elastic#40624
bizybot added a commit to bizybot/elasticsearch that referenced this issue Jun 13, 2019
elastic#41710)

Infra has fixed elastic#10462 by installing `haveged` on CI workers.
This commit enables the disabled fixture and tests, and mounts
`/dev/urandom` for the container so there is enough
entropy required for kdc.
Note: hdfs-repository tests have been disabled, will raise a separate issue for it.

Closes elastic#40624 Closes elastic#40678
bizybot added a commit that referenced this issue Jun 13, 2019
#41710) (#43179)

Infra has fixed #10462 by installing `haveged` on CI workers.
This commit enables the disabled fixture and tests, and mounts
`/dev/urandom` for the container so there is enough
entropy required for kdc.
Note: hdfs-repository tests have been disabled, will raise a separate issue for it.

Closes #40624 Closes #40678
bizybot added a commit that referenced this issue Jun 13, 2019
#41710) (#43178)

Infra has fixed #10462 by installing `haveged` on CI workers.
This commit enables the disabled fixture and tests, and mounts
`/dev/urandom` for the container so there is enough
entropy required for kdc.
Note: hdfs-repository tests have been disabled, will raise a separate issue for it.

Closes #40624 Closes #40678
bizybot added a commit that referenced this issue Jun 13, 2019
#41710) (#43180)

Infra has fixed #10462 by installing `haveged` on CI workers.
This commit enables the disabled fixture and tests, and mounts
`/dev/urandom` for the container so there is enough
entropy required for kdc.
Note: hdfs-repository tests have been disabled, will raise a separate issue for it.

Closes #40624 Closes #40678
bizybot added a commit that referenced this issue Jun 13, 2019
#41710) (#43181)

Infra has fixed #10462 by installing `haveged` on CI workers.
This commit enables the disabled fixture and tests, and mounts
`/dev/urandom` for the container so there is enough
entropy required for kdc.
Note: hdfs-repository tests have been disabled, will raise a separate issue for it.

Closes #40624 Closes #40678
@alpar-t
Contributor Author

alpar-t commented Jun 13, 2019

I think we also have something on the workers to generate entropy now, so we might not need to change our config at all and can just re-enable the fixture. Sorry, I missed that this was never re-enabled. See the linked infra ticket for more information.

@mark-vieira mark-vieira added the Team:Delivery Meta label for Delivery team label Nov 11, 2020
@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
5 participants