Skip to content

[baremetal] Add dispatcher script to statically configure DHCP address #2188

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 3, 2020

Conversation

cybertron
Copy link
Member

We have a feature request for nodes to be able to initially get
assigned their IP addresses via infinite DHCP leases, but then to
have those addresses statically configured on the node so that if
the DHCP server is unavailable for any reason they don't lose their
network configuration.

This adds a dispatcher script for NetworkManager that does so. It
looks at the DHCP assigned network settings and adds static
configuration with the same values.

- What I did

- How to verify it

- Description for the changelog

@cybertron
Copy link
Member Author

/hold

This is a PoC. If we decide to pursue this it will almost certainly need some refinement.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 28, 2020
#!/bin/bash
set -ex

if [ ${DHCP4_DHCP_LEASE_TIME:-0} -lt 4294967295 ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this not exist in IPv6?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only implemented ipv4 in this version. Based on my discussion with the NM team, I would probably argue that for this use case in ipv6 we should just use SLAAC addressing. If there's a reason we can't do that then we can always add support for DHCPv6 here too.

@cybertron
Copy link
Member Author

/hold cancel

I've tested this locally with infinite DHCP leases and it worked fine.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 25, 2020
@celebdor
Copy link
Contributor

/retest

celebdor added a commit to celebdor/machine-config-operator that referenced this pull request Dec 1, 2020
This is essentially an ipv6 version of openshift#2188.

Co-authored-by: Antoni Segura Puimedon <[email protected]>
@cybertron
Copy link
Member Author

/hold

We discovered that this would cause problems on reboot. Fix coming.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 1, 2020
@cybertron cybertron force-pushed the static-from-dhcp branch 2 times, most recently from 8c8441f to c8ed64a Compare December 2, 2020 03:41
cybertron added a commit to cybertron/machine-config-operator that referenced this pull request Dec 2, 2020
This is essentially an ipv6 version of openshift#2188.

Co-authored-by: Antoni Segura Puimedon <[email protected]>
@cybertron cybertron force-pushed the static-from-dhcp branch 2 times, most recently from a78db50 to 0551cd9 Compare December 2, 2020 18:53
@cybertron cybertron force-pushed the static-from-dhcp branch 2 times, most recently from dceaf9f to e905d8c Compare December 2, 2020 20:40
@celebdor
Copy link
Contributor

celebdor commented Dec 3, 2020

Just tested it with a small modification on one of my masters with openshift-sdn. It worked well. Here some snippets:

mode: 0755
path: "/etc/NetworkManager/dispatcher.d/30-static-dhcp"
contents:
  inline: |
    #!/bin/bash
    set -ex -o pipefail
    if [[ "{{ .NetworkType }}" == "OVNKubernetes" && "$CONNECTION_ID" == "Wired Connection" ]]
    then
        >&2 echo "Refusing to modify default connection."
        exit 0
    fi

    if [ -z ${DHCP4_IP_ADDRESS:-} ]; then
        >&2 echo "Not a DHCP4 address. Ignoring."
        exit 0
    fi
    if [ ${DHCP4_DHCP_LEASE_TIME:-0} -lt 4294967295 ]; then
        >&2 echo "Not an infinite DHCP4 lease. Ignoring."
        exit 0
    fi
    IPS=($IP4_ADDRESS_0)
    CIDR=${IPS[0]}
    GATEWAY=${IPS[1]}
    TYPE=$(nmcli --get-values connection.type connection show "$CONNECTION_ID")
    if ! nmcli con show inf-lease-to-static
    then
        nmcli con add type "$TYPE" con-name inf-lease-to-static
    fi
    nmcli con mod inf-lease-to-static \
      conn.interface "$1" \
      connection.autoconnect yes \
      ipv4.addresses "$CIDR" \
      ipv4.method manual \
      ipv4.gateway "$GATEWAY" \
      ipv4.dns "$IP4_NAMESERVERS"
    if [ -n "$IP4_DOMAINS" ]; then
        nmcli con mod inf-lease-to-static ipv4.dns-search "$IP4_DOMAINS"
    fi
    plus=''
    for i in $(seq 0 $(($IP4_NUM_ROUTES-1)) )
    do
        varname="IP4_ROUTE_$i"
        nmcli con mod inf-lease-to-static ${plus}ipv4.routes "${!varname}"
        plus='+'
    done
    nmcli con up inf-lease-to-static
    # Copy it from the OverlayFS mount to the persistent lowerdir
    cp "/etc/NetworkManager/system-connections-merged/inf-lease-to-static.nmconnection" /etc/NetworkManager/system-connections

    if [ -n "${DHCP4_HOST_NAME:-}" ]
    then
        hostnamectl set-hostname --static --transient "$DHCP4_HOST_NAME"
    fi

After making these changes, that make sure that it works for openshift SDN and for OVN kubernetes, you can see in the following image that there is a reboot of master-0 (I also rebooted master-1) after I had killed the dnsmasq that provided DHCP functionality.
You can also see that the nodes boot up with the hostname they are supposed and become ready again.

image

We have a feature request for nodes to be able to initially get
assigned their IP addresses via infinite DHCP leases, but then to
have those addresses statically configured on the node so that if
the DHCP server is unavailable for any reason they don't lose their
network configuration.

This adds a dispatcher script for NetworkManager that does so. It
looks at the DHCP assigned network settings and adds static
configuration with the same values.
@cybertron
Copy link
Member Author

/hold cancel

I also had success rebooting nodes with the latest version of this. I think it's good to go.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 3, 2020
@celebdor
Copy link
Contributor

celebdor commented Dec 3, 2020

/lgtm

The version in this PR now matches what I tested successfully in #2188 (comment)

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2020
@sinnykumari
Copy link
Contributor

overall lgtm.
/approve

@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: celebdor, cybertron, sinnykumari

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 3, 2020
@sinnykumari
Copy link
Contributor

/retest

@celebdor
Copy link
Contributor

celebdor commented Dec 3, 2020

/retest

@openshift-merge-robot openshift-merge-robot merged commit 4c331f2 into openshift:master Dec 3, 2020
JM1 added a commit to JM1/openshift-machine-config-operator that referenced this pull request Apr 1, 2025
NetworkManager dispatcher scripts 30-static-dhcp and 30-static-dhcpv6
modify the NetworkManager connection of the OVS interface ovs-if-br-ex
in case of a infinite DHCP lease: They will change ipv4.method and
ipv6.method from auto to manual in order to statically assign DHCP
addresses etc.

The purpose of this feature was to enable nodes to keep their network
configuration and network connectivity when their DHCP server becomes
unavailable for any reason [0]. For example, this allowed users to
deploy OpenShift nodes with "static" ip addresses, i.e. ip addresses
with infitite DHCP leases, using an initial DHCP server that would then
be removed once the cluster is up.

However, the changes by the dispatcher scripts to NM connection
ovs-if-br-ex were saved to disk which broke the rollback behaviour of
configure-ovs.sh (ovs-configuration.service):
OVN-Kubernetes configuration is rolled back and (re)generated on every
boot, to take in account any changes that have possibly been applied in
the standard configuration sources.

Nowadays, OpenShift has proper support for static ip addresses built-in
and this inf-lease-to-static feature has not been documented since 4.9.
Both dispatcher scripts were only kept due to the "no breaking changes"
policy in OpenShift. However, with configure-ovs.sh's rollback feature
being broken, those dispatcher scripts could finally be removed.

[0] openshift#2188
[1] openshift#4954
JM1 added a commit to JM1/openshift-machine-config-operator that referenced this pull request Apr 1, 2025
NetworkManager dispatcher scripts 30-static-dhcp and 30-static-dhcpv6
modify the NetworkManager connection of the OVS interface ovs-if-br-ex
in case of a infinite DHCP lease: They will change ipv4.method and
ipv6.method from auto to manual in order to statically assign DHCP
addresses etc.

The purpose of this feature was to enable nodes to keep their network
configuration and network connectivity when their DHCP server becomes
unavailable for any reason [0]. For example, this allowed users to
deploy OpenShift nodes with "static" ip addresses, i.e. ip addresses
with infitite DHCP leases, using an initial DHCP server that would then
be removed once the cluster is up.

However, the changes by the dispatcher scripts to NM connection
ovs-if-br-ex were saved to disk which conflicts with the rollback
behaviour of configure-ovs.sh (ovs-configuration.service):
OVN-Kubernetes configuration is rolled back and (re)generated on every
boot, to take in account any changes that have possibly been applied in
the standard configuration sources. In OpenShift releases without
PR openshift#4877 [2], the NM connection file for ovs-if-br-ex in
/etc/NetworkManager/system-connections will not be removed during
rollback which causes conflicts when changes have been applied in the
standard configuration sources. With PR openshift#4877 [2], the NM connection
file is removed during rollback, defeating the purpose of this feature.

Nowadays, OpenShift has proper support for static ip addresses built-in
and this inf-lease-to-static feature has not been documented since 4.9.
Both dispatcher scripts were only kept due to the "no breaking changes"
policy in OpenShift. However, because of the conflict with
configure-ovs.sh's rollback feature, those dispatcher scripts could
finally be removed.

[0] openshift#2188
[1] openshift#4954
[2] openshift#4877
JM1 added a commit to JM1/openshift-machine-config-operator that referenced this pull request Apr 1, 2025
NetworkManager dispatcher scripts 30-static-dhcp and 30-static-dhcpv6
modify the NetworkManager connection of the OVS interface ovs-if-br-ex
in case of a infinite DHCP lease: They will change ipv4.method and
ipv6.method from auto to manual in order to statically assign DHCP
addresses etc.

The purpose of this feature was to enable nodes to keep their network
configuration and network connectivity when their DHCP server becomes
unavailable for any reason [0]. For example, this allowed users to
deploy OpenShift nodes with "static" ip addresses, i.e. ip addresses
with infitite DHCP leases, using an initial DHCP server that would then
be removed once the cluster is up.

However, the changes by the dispatcher scripts to NM connection
ovs-if-br-ex were saved to disk which conflicts with the rollback
behaviour of configure-ovs.sh (ovs-configuration.service):
OVN-Kubernetes configuration is rolled back and (re)generated on every
boot, to take in account any changes that have possibly been applied in
the standard configuration sources. In OpenShift releases without
PR openshift#4877 [2], the NM connection file for ovs-if-br-ex in
/etc/NetworkManager/system-connections will not be removed during
rollback which causes conflicts when changes have been applied in the
standard configuration sources. With PR openshift#4877 [2], the NM connection
file is removed during rollback, defeating the purpose of this feature.

Nowadays, OpenShift has proper support for static ip addresses built-in
and this inf-lease-to-static feature has not been documented since 4.9.
Both dispatcher scripts were only kept due to the "no breaking changes"
policy in OpenShift. However, because of the conflict with
configure-ovs.sh's rollback feature, those dispatcher scripts could
finally be removed.

[0] openshift#2188
[1] openshift#4954
[2] openshift#4877
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants