PVC used by a job doesn't get resize after the pod of the job completed #175
Comments
/sig storage |
We can open a PR if you guys agree with our proposal |
ping @gnufied @mauriciopoppe |
I'm not familiar with the external-resizer, @msau42 do you know who can help? |
Hey sorry I was on PTO last week. The proposal looks good. Please open a PR. |
I see that the fix is tagged for the v1.4.0 release. However, v1.4.0 requires CSI spec 1.5.0, and not all storage providers use CSI spec 1.5.0 yet. The bug also exists in external-resizer v1.2.0 and v1.3.0. Could we therefore backport the fix to release-1.2 and release-1.3, and release new versions v1.2.1 and v1.3.1? |
The CSI spec is backward compatible, and I don't recall any core changes to the features that affect volume expansion in the CSI spec for a while. So a driver that does not implement 1.5.0 should still be able to run with external-resizer v1.4.0; I do not expect any issues. We could still backport the fix to the older releases if that makes it easier. |
@gnufied It would be great if we could backport the fix to release-1.2 and release-1.3, and release new versions v1.2.1 and v1.3.1. In our case, we are using external-resizer v1.2.0 with CSI spec v1.2.0. I understand that it could work with external-resizer v1.4.0, which uses CSI spec 1.5.0, but it could also surface unforeseen bugs that we are not yet aware of. I remember we had some incidents with csi-provisioner before, which makes us cautious. |
Feel free to create backport request. I will make a release. |
Thanks @gnufied |
You will find release branches for those versions and you can create PRs against those branches. |
@gnufied Created the PRs. Please take a look. Thank you |
@gnufied Now that the backport PRs are merged, do you have an ETA for the patch releases v1.2.1 and v1.3.1? |
Summary:
We have a setup in which `external-resizer` is used with a storage provider that only supports offline expansion (e.g., it only supports `PluginCapability_VolumeExpansion_OFFLINE`). We deployed a job that uses a PVC provisioned by the storage provider. While the job pod is running, we resize the PVC by modifying `spec.resources.requests.storage`. As expected, the PVC cannot be resized while the pod is running. However, after the job pod completes, the PVC still doesn't get resized: `external-resizer` doesn't send a resize gRPC call to the storage provider. The PVC stays stuck in this state until we manually delete the job pod.

Reproduce steps:
1. Deploy `external-resizer` together with a storage provider (we use Longhorn).
2. Don't set the `--handle-volume-inuse-error` flag for `external-resizer`. This means that, by default, `external-resizer` handles volume-in-use errors in the resizer controller.
3. Deploy a job that uses a PVC. The job creates a pod that sleeps for 2 minutes and then completes.
4. While the job pod is running, try to expand the PVC by editing `spec.resources.requests.storage`.
5. Observe that the resize fails.
6. Wait for the job pod to complete.
7. Observe that the PVC is stuck in the current state forever. It doesn't get resized because `external-resizer` doesn't attempt to make a gRPC expansion call to the storage provider.

Expected Behavior:
Once the job pod is completed, the PVC is no longer considered to be `in-use`. Therefore, `external-resizer` should attempt to make a gRPC expansion call to the storage provider.

Proposal:
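We dug into the source code and saw that:
- `external-resizer` is prevented from retrying if the PVC has had InUseErrors before AND it is in the `ctrl.usedPVCs` map.
- The PVC is not removed from the `ctrl.usedPVCs` map when a pod moves to the `completed` phase. The PVC is only removed when the pod is deleted.
- We propose also removing the PVC from the map when the pod becomes `completed`.

Env: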
`external-resizer` v1.2.0
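
To make the proposed behavior concrete, here is a minimal, self-contained Go sketch of the idea. It is not the actual `external-resizer` code: `Pod`, `PodPhase`, `usedPVCTracker`, `onPodUpdate`, and `shouldRetryExpansion` are hypothetical stand-ins that only model the in-use map and the retry gate described above, assuming that a "completed" pod corresponds to a terminal pod phase (`Succeeded` or `Failed`).

```go
package main

import "fmt"

// PodPhase is a local stand-in for the Kubernetes pod phase (not the k8s.io/api type).
type PodPhase string

const (
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded" // assumed to be what the issue calls "completed"
	PodFailed    PodPhase = "Failed"
)

// Pod is a hypothetical, stripped-down pod: a name, a phase, and the PVCs it mounts.
type Pod struct {
	Name  string
	Phase PodPhase
	PVCs  []string // "namespace/name" keys
}

// usedPVCTracker is a simplified model of an in-use map:
// PVC key -> set of pod names that currently use the PVC.
type usedPVCTracker struct {
	users map[string]map[string]struct{}
}

func newUsedPVCTracker() *usedPVCTracker {
	return &usedPVCTracker{users: make(map[string]map[string]struct{})}
}

// addPod records the pod as a user of each PVC it mounts.
func (t *usedPVCTracker) addPod(p Pod) {
	for _, pvc := range p.PVCs {
		if t.users[pvc] == nil {
			t.users[pvc] = make(map[string]struct{})
		}
		t.users[pvc][p.Name] = struct{}{}
	}
}

// removePod drops the pod from each PVC's user set and deletes empty sets.
func (t *usedPVCTracker) removePod(p Pod) {
	for _, pvc := range p.PVCs {
		delete(t.users[pvc], p.Name) // no-op if the PVC is not tracked
		if len(t.users[pvc]) == 0 {
			delete(t.users, pvc)
		}
	}
}

// inUse reports whether at least one tracked pod still uses the PVC.
func (t *usedPVCTracker) inUse(pvc string) bool {
	return len(t.users[pvc]) > 0
}

// isTerminated reports whether the pod has reached a terminal phase.
func isTerminated(p Pod) bool {
	return p.Phase == PodSucceeded || p.Phase == PodFailed
}

// onPodUpdate models the proposal: a terminated pod is treated like a deleted
// pod, so its PVCs stop counting as in use.
func (t *usedPVCTracker) onPodUpdate(p Pod) {
	if isTerminated(p) {
		t.removePod(p)
		return
	}
	t.addPod(p)
}

// shouldRetryExpansion models the retry gate from the analysis: expansion is
// skipped only while the PVC both hit an in-use error and is still tracked.
func shouldRetryExpansion(t *usedPVCTracker, pvc string, hadInUseError bool) bool {
	return !(hadInUseError && t.inUse(pvc))
}

func main() {
	tracker := newUsedPVCTracker()
	pvc := "default/job-pvc"
	pod := Pod{Name: "job-pod", Phase: PodRunning, PVCs: []string{pvc}}

	tracker.onPodUpdate(pod)
	fmt.Println("retry while running:", shouldRetryExpansion(tracker, pvc, true))

	pod.Phase = PodSucceeded
	tracker.onPodUpdate(pod)
	fmt.Println("retry after completion:", shouldRetryExpansion(tracker, pvc, true))
}
```

In this model the first call reports `false` (the PVC is still held by a running pod) and the second reports `true`, which is the behavior requested under Expected Behavior. Without the terminal-phase branch in `onPodUpdate`, the PVC would stay in the map until the pod is deleted, which matches the stuck state described in this issue.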