node/sdn: make /var/lib/cni persistent to ensure IPAM allocations stick around across node restart #13231
Tested with a containerized node in docker-in-docker; /var/lib/cni is preserved across 'docker restart origin/node' invocations.
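The PR does not describe how preservation was checked. One way to script such a check, as a rough sketch only, is to snapshot the state directory before and after the restart and compare the two. The helper names `snapshot` and `lost_entries` are hypothetical and not part of origin.

```python
from pathlib import Path


def snapshot(state_dir):
    """Return {relative path: file contents} for every file under state_dir."""
    root = Path(state_dir)
    return {
        str(p.relative_to(root)): p.read_bytes()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def lost_entries(before, after):
    """Entries present before the restart that are missing or changed after it."""
    return sorted(k for k, v in before.items() if after.get(k) != v)
```

Calling `snapshot("/var/lib/cni")` before and after 'docker restart origin/node' and passing both results to `lost_entries` should yield an empty list if the directory survived. Pods that were legitimately deleted between the two snapshots would also show up as lost, so the check is only meaningful while the set of pods is stable.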
LGTM
@sdodson I have a release-1.5 branch ready for PR too, do you want separate PRs or is one good enough?
[merge]
@dcbw yes, please open one for release-1.5
Evaluated for origin merge up to b0711a5
[Test]ing while waiting on the merge queue
Evaluated for origin test up to b0711a5
continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin_future/823/) (Base Commit: 87597f4)
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin_future/830/) (Base Commit: 3e26eed) (Image: devenv-rhel7_6036)
With the move to a CNI plugin, docker no longer handles IPAM, but CNI does through
openshift-sdn's usage of the 'host-local' CNI IPAM plugin. That plugin stores
IPAM allocations under /var/lib/cni/.
If the node container gets restarted without preserving /var/lib/cni, the IPs
currently allocated to running pods are lost, and on restart openshift-sdn
may allocate those IPs to new pods, causing duplicate allocations.
This never happened with docker because it has its own persistent IPAM store that
is not removed when docker restarts, and because (historically) when docker
restarted, all the containers died and the daemon released their IP
allocations.
Fix this by ensuring that IPAM allocations (which are tied to the life of the pod,
not the life of the openshift-node process) persist even if the openshift-node
process restarts.
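The failure mode and the fix can be illustrated with a toy model. This is a simplified sketch, not the 'host-local' plugin's code: it assumes an allocator that reserves an address by creating a file named after it in a state directory, and the class `FileBackedIPAM`, the function `simulate_restart`, the pod names, and the subnet are all hypothetical.

```python
import ipaddress
import shutil
import tempfile
from pathlib import Path


class FileBackedIPAM:
    """Toy allocator: one file per allocated IP, named after the address."""

    def __init__(self, data_dir, subnet):
        self.data_dir = Path(data_dir)
        self.data_dir.mkdir(parents=True, exist_ok=True)
        self.hosts = [str(ip) for ip in ipaddress.ip_network(subnet).hosts()]

    def allocate(self, pod_id):
        for ip in self.hosts:
            try:
                # Exclusive create fails if the address is already reserved on disk.
                with open(self.data_dir / ip, "x") as f:
                    f.write(pod_id)
                return ip
            except FileExistsError:
                continue
        raise RuntimeError("subnet exhausted")


def simulate_restart(preserve_state):
    """Allocate for a running pod, 'restart' the node, allocate for a new pod."""
    state = Path(tempfile.mkdtemp())  # stands in for /var/lib/cni
    try:
        before = FileBackedIPAM(state, "10.128.0.0/29")
        running_pod_ip = before.allocate("pod-a")
        if not preserve_state:
            # Node container restarted without a persistent state directory.
            shutil.rmtree(state)
        after = FileBackedIPAM(state, "10.128.0.0/29")
        new_pod_ip = after.allocate("pod-b")
        return running_pod_ip, new_pod_ip
    finally:
        shutil.rmtree(state, ignore_errors=True)


if __name__ == "__main__":
    print(simulate_restart(preserve_state=False))  # ('10.128.0.1', '10.128.0.1')
    print(simulate_restart(preserve_state=True))   # ('10.128.0.1', '10.128.0.2')
```

When the state directory is wiped, the new pod receives the address still in use by the running pod. When it is preserved, the allocator skips the reserved address, which is the behavior this change aims for.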
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1427789
@sdodson @openshift/networking @eparis