k8s config tmp files cleared by centos #765


Closed
xxxxczxc opened this issue Feb 19, 2019 · 10 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@xxxxczxc

xxxxczxc commented Feb 19, 2019

The k8s client creates some temp files in /tmp when the process starts, but CentOS clears files in that directory that were created more than 10 days ago. After the temp files are cleared, if my process loses its connection to k8s for any reason, calls to the k8s API fail with 'No such file or directory':

MaxRetryError: HTTPSConnectionPool(host='xxx.xxx.xxx.xxx', port=xxxx): Max retries exceeded with url: /api/v1/namespaces/canary/resourcequotas/canary-compute-resources (Caused by SSLError(IOError(2, 'No such file or directory'),))

It seems that kubernetes-client reads the temp files again when it tries to reconnect to k8s.
So I call load_kube_config() after catching the exception above, to try to recreate the temp files and recover, but it raises another error:

File "/var/deploy/venv/lib/python2.7/site-packages/kubernetes/config/kube_config.py", line 334, in _load_cluster_info
file_base_path=self._config_base_path).as_file()
File "/var/deploy/venv/lib/python2.7/site-packages/kubernetes/config/kube_config.py", line 104, in as_file
raise ConfigException("File does not exists: %s" % self._file)
ConfigException: File does not exists: /tmp/tmpux9Zef

Does kubernetes-client have an interface to recreate the temp files, or to refresh them periodically?
Or is there another way to solve this problem?
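A common workaround (not an official client API; the directory name below is a placeholder) is to point Python's tempfile module at a directory that is exempt from tmp cleanup before the kubeconfig is loaded, so the decoded certificate temp files survive:

```python
import os
import tempfile

# Placeholder directory; any location not cleaned by systemd-tmpfiles
# or CentOS tmpwatch will do.
safe_tmp = os.path.join(os.getcwd(), "k8s_client_tmp")
os.makedirs(safe_tmp, exist_ok=True)

# tempfile honors TMPDIR; reset its cached default so the new value
# takes effect for subsequent temp-file creation.
os.environ["TMPDIR"] = safe_tmp
tempfile.tempdir = None

# From here on, kubernetes.config.load_kube_config() (and anything else
# that uses tempfile) writes its temp files under safe_tmp instead of /tmp.
fd, path = tempfile.mkstemp()
os.close(fd)
```

Setting TMPDIR in the service's environment before the Python process starts achieves the same thing without touching application code.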

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 29, 2019
@tbarrella

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 26, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 24, 2019
@tbarrella

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 25, 2019
@pigletfly

There is a discussion in kubernetes-client/python-base#38.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 22, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 21, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

starlingx-github pushed a commit to starlingx/config that referenced this issue Jun 18, 2020
sysinv calls the k8s python client to perform a number of operations.
The k8s python client creates temp files under /tmp and continues to
use these tmp files for the lifetime of the process.

However, systemd-tmpfiles-clean.service runs every day to clean up
files in the /tmp dir that are older than 10 days. If the k8s client
code is not triggered for more than 10 days (so its temp files are
not accessed for more than 10 days), these temp files are removed as
part of the cleanup. Certain sysinv operations then start to fail
with an error that the tmp file is no longer there.

This is a known issue with the kubernetes python client:
kubernetes-client/python#765

This commit fixes the issue by setting TMPDIR to /var/run/sysinv when
sm starts sysinv-conductor and sysinv-inv.

Change-Id: I365d637abd080bd03b65758e4e8db9203d6bfa4d
Closes-Bug: 1883599
Signed-off-by: Andy Ning <[email protected]>
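The approach in the commit above generalizes to any long-running service: a systemd drop-in can set TMPDIR and let systemd manage the directory. A sketch, with illustrative service and path names (not taken from StarlingX):

```ini
# /etc/systemd/system/my-service.service.d/tmpdir.conf (illustrative path)
[Service]
# Keep the k8s python client's temp files out of /tmp so periodic
# tmp cleanup cannot delete them while the service is running.
Environment=TMPDIR=/run/my-service-tmp
# systemd creates /run/my-service-tmp when the service starts.
RuntimeDirectory=my-service-tmp
```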
starlingx-github pushed a commit to starlingx/config that referenced this issue Jun 18, 2020
sysinv calls the k8s python client to perform a number of operations.
The k8s python client creates temp files under /tmp and continues to
use these tmp files for the lifetime of the process.

However, systemd-tmpfiles-clean.service runs every day to clean up
files in the /tmp dir that are older than 10 days. If the k8s client
code is not triggered for more than 10 days (so its temp files are
not accessed for more than 10 days), these temp files are removed as
part of the cleanup. Certain sysinv operations then start to fail
with an error that the tmp file is no longer there.

This is a known issue with the kubernetes python client:
kubernetes-client/python#765

This commit fixes the issue by setting TMPDIR to /var/run/sysinv_tmp
when sm starts sysinv-conductor and sysinv-inv.

Change-Id: I8544272b2431607ed1041473c5da2eecb64635af
Closes-Bug: 1883599
Signed-off-by: Andy Ning <[email protected]>
starlingx-github pushed a commit to starlingx/distcloud that referenced this issue Sep 25, 2020
dcmanager calls the k8s python client to perform a number of
operations. The k8s python client creates temp files under /tmp and
continues to use these tmp files for the lifetime of the process.

However, systemd-tmpfiles-clean.service runs every day to clean up
files in the /tmp dir that are older than 10 days. If the k8s client
code is not triggered for more than 10 days (so its temp files are
not accessed for more than 10 days), these temp files are removed as
part of the cleanup. Certain dcmanager operations then start to fail
with an error that the tmp file is no longer there.

This is a known issue with the kubernetes python client:
kubernetes-client/python#765

This commit fixes the issue by setting TMPDIR to /var/run/dcmanager
when sm starts dcmanager-manager.

Change-Id: Ib147c2ab26e303032e18da51a506e3768bc471e0
Closes-Bug: 1883599
Signed-off-by: Andy Ning <[email protected]>
starlingx-github pushed a commit to starlingx/nfv that referenced this issue Sep 25, 2020
nfv vim calls the k8s python client to perform a number of
operations. The k8s python client creates temp files under /tmp and
continues to use these tmp files for the lifetime of the process.

However, systemd-tmpfiles-clean.service runs every day to clean up
files in the /tmp dir that are older than 10 days. If the k8s client
code is not triggered for more than 10 days (so its temp files are
not accessed for more than 10 days), these temp files are removed as
part of the cleanup. Certain vim operations then start to fail with
an error that the tmp file is no longer there.

This is a known issue with the kubernetes python client:
kubernetes-client/python#765

This commit fixes the issue by setting TMPDIR to /var/run/nfv-vim
when sm starts vim.

Change-Id: I4f0544055e9d10ba2374e9fdb5133d767c1fa2c3
Closes-Bug: 1883599
Signed-off-by: Andy Ning <[email protected]>
@davidboybob

/remove-lifecycle stale

starlingx-github pushed a commit to starlingx/config that referenced this issue Jul 20, 2021
Redirect the k8s python client's use of /tmp to /var/run/cert-mon_tmp
by setting TMPDIR.

This is a known issue with the kubernetes python client:
kubernetes-client/python#765

The fix is the same as for
https://bugs.launchpad.net/starlingx/+bug/1883599
See commit message there for more details.

Related-Bug: 1883599
Closes-Bug: 1936435

Signed-off-by: Kyle MacLeod <[email protected]>
Change-Id: I0e163bd1b4d5a19f07267dd4cd14bad1b8cb20bb
starlingx-github pushed a commit to starlingx/distcloud that referenced this issue May 17, 2024
dcmanager-orchestrator calls the k8s python client to perform a
number of operations. The k8s python client creates temp files under
/tmp and continues to use these tmp files for the lifetime of the
process.

However, systemd-tmpfiles-clean.service runs every day to clean up
files in the /tmp dir that are older than 10 days. If the k8s client
code is not triggered for more than 10 days (so its temp files are
not accessed for more than 10 days), these temp files are removed as
part of the cleanup. Certain dcmanager-orchestrator operations then
start to fail with an error that the tmp file is no longer there.

This is a known issue with the kubernetes python client:
kubernetes-client/python#765

This commit fixes the issue by setting TMPDIR to
/var/run/dcmanager_orchestrator_tmp when sm starts
dcmanager-orchestrator.

The following similar commits were added for the sysinv and dcmanager
services in the past:
https://review.opendev.org/c/starlingx/config/+/736761
https://review.opendev.org/c/starlingx/distcloud/+/736247

Closes-bug: 2066048

Change-Id: I3d39f5b034e3ef2e6ad9636e86f26f0e93f16d45
Signed-off-by: amantri <[email protected]>