Skip to content

Update snapshot controller #201

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Dec 25, 2019

Conversation

xing-yang
Copy link
Collaborator

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
This PR updates the snapshot controller logic.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Dec 2, 2019
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 2, 2019
@xing-yang xing-yang force-pushed the controller_update branch 2 times, most recently from f7815c5 to 582802f Compare December 11, 2019 20:11
@xing-yang xing-yang changed the title WIP: Update snapshot controller Update snapshot controller Dec 11, 2019
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 11, 2019
@xing-yang
Copy link
Collaborator Author

/assign @yuxiangqian

Copy link
Contributor

@yuxiangqian yuxiangqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WIP

@@ -1193,3 +1218,48 @@ func (ctrl *csiSnapshotCommonController) contentExists(snapshot *crdv1.VolumeSna
// Found in content cache store and convert object is successful
return content, nil
}

func (ctrl *csiSnapshotCommonController) snapshotExists(snapshotName string) (*crdv1.VolumeSnapshot, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe renames it to getSnapshot to align with the return values?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change it to getSnapshotFromStore.

if snapshot != nil && snapshot.UID != content.Spec.VolumeSnapshotRef.UID {
// The snapshot that the content was pointing to was deleted, and another
// with the same name created.
klog.V(4).Infof("synchronizing VolumeSnapshotContent[%s]: content %s has different UID, the old one must have been deleted", content.Name, snapshotName)
// Treat the content as bound to a missing snapshot.
snapshot = nil
} else {
// TODO(xyang): Write a function to check if snapshot.Status is different from content.Status and add snapshot to queue if there is a difference and it is worth triggering an snapshot status update
// Check if content status is set to true and update snapshot status if so
if snapshot != nil && content.Status != nil && content.Status.ReadyToUse != nil && *content.Status.ReadyToUse == true {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the status checking needs to be in-place before merged? otherwise it will trigger unnecessary item for syncSnapshot

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will fix soon.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

return err
}
klog.V(5).Infof("syncContent: volume snapshot content %+v", content)
_, err = ctrl.setAnnVolumeSnapshotBeingDeleted(content)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, i would not return a new content object if the cache has been handled by setAnnVolumeSnapshotBeingDeleted call.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

func (ctrl *csiSnapshotCommonController) processIfDeletionTimestampSet(snapshot *crdv1.VolumeSnapshot) error {
klog.V(5).Infof("processIfDeletionTimestampSet VolumeSnapshot[%s]: %s", utils.SnapshotKey(snapshot), utils.GetSnapshotStatusForLogging(snapshot))

content, deleteContent, _, err := ctrl.checkContentAndBoundStatus(snapshot)
if err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be an issue if no content is found? it should proceed to remove any finalizers?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"no content is found" will not return err. I need to add better comments here.

func (ctrl *csiSnapshotCommonController) getContentFromStore(snapshot *crdv1.VolumeSnapshot) (*crdv1.VolumeSnapshotContent, error) {
        ......
        obj, found, err := ctrl.contentStore.GetByKey(contentName)
        if err != nil {
                return nil, err
        }
        // Not in the content cache store, no error
        if !found {
                return nil, nil
        }

@@ -241,7 +238,7 @@ func (ctrl *csiSnapshotCommonController) checkandRemoveSnapshotFinalizersAndChec
// Check if a volume is being created from snapshot.
inUse := ctrl.isVolumeBeingCreatedFromSnapshot(snapshot)

klog.V(5).Infof("syncSnapshot[%s]: set DeletionTimeStamp on content.", utils.SnapshotKey(snapshot))
klog.V(5).Infof("checkandRemoveSnapshotFinalizersAndCheckandDeleteContent[%s]: set DeletionTimeStamp on content.", utils.SnapshotKey(snapshot))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is checkandRemoveSnapshotFinalizersAndCheckandDeleteContent still needed? "IsSnapshotDeletionCandidate" already checks the finalizer pv controller put on a snapshot?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"IsSnapshotDeletionCandidate" only checks if deletionTimeStamp is set on VolumeSnapshot and if there are finalizers on VolumeSnapshot.
checkandRemoveSnapshotFinalizersAndCheckandDeleteContent calls IsSnapshotDeletionCandidate, and then checks whether those finalizers can be removed by checking if a snapshot is bound to a content and if a volume is being created from the snapshot. If snapshot is not bound and not used by a volume, finalizers will be removed.

// processIfDeletionTimestampSet processes finalizers and deletes the content when appropriate
// It checks if contents exists, it checks if snapshot has bi-directional binding, it checks if
// finalizers should be removed, and it checks if content should be deleted and deletes it if needed.
func (ctrl *csiSnapshotCommonController) processIfDeletionTimestampSet(snapshot *crdv1.VolumeSnapshot) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about this corner case for deletion, following is the condition:

  1. snapshot deletion timestamp set
  2. snapshot content has not be created yet, checkContentAndBoundStatus returns nil on content and error
  3. a pvc is using the snapshot to restore. checkandRemoveSnapshotFinalizersAndCheckandDeleteContent will return nil
  4. checkandRemovePVCFinalizer will remove source PVC finalizer!!

in this case, 4th step should be held?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If snapshot content has not been created yet, user can't create a PVC from the snapshot. The provisioner will check whether snapshot and content has bi-directional binding and snapshot status readyToUse is true before proceeding to create PVC from snapshot.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually not talking about restoring, I was talking about PVC source finalizer.
But I guess it should be fine implementing this way as well. basically it means, if a content object has not been created yet, and snapshot is being deleted,
it's fine to remove the source PVC finalizer, no content would be ever created for that snapshot

func NeedToAddSnapshotAsSourceFinalizer(snapshot *crdv1.VolumeSnapshot) bool {
return snapshot.ObjectMeta.DeletionTimestamp == nil && !slice.ContainsString(snapshot.ObjectMeta.Finalizers, VolumeSnapshotAsSourceFinalizer, nil)
}

// needToAddSnapshotBoundFinalizer checks if a Finalizer needs to be added for the bound volume snapshot.
// NeedToAddSnapshotBoundFinalizer checks if a Finalizer needs to be added for the bound volume snapshot.
func NeedToAddSnapshotBoundFinalizer(snapshot *crdv1.VolumeSnapshot) bool {
return snapshot.ObjectMeta.DeletionTimestamp == nil && !slice.ContainsString(snapshot.ObjectMeta.Finalizers, VolumeSnapshotBoundFinalizer, nil) && snapshot.Status != nil && snapshot.Status.BoundVolumeSnapshotContentName != nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, status check seem to be redundant? or maybe just call "IsBoundVolumeSnapshotContentNameSet"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Will change.

// 1) snapshot.Status is nil
// 2) snapshot.Status.ReadyToUse is false
// 3) snapshot.Status.BoundVolumeSnapshotContentName is not set
if !utils.IsSnapshotReady(snapshot) || !utils.IsBoundVolumeSnapshotContentNameSet(snapshot) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, is IsBoundVolumeSnapshotContentNameSet needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should update status and continue to sync up if BoundVolumeSnapshotContentNameSet is not set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but in what case "IsSnapshotReady" will return true with "BoundVolumeSnapshotContentNameSet" not set? it seems to be not possible?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, usually it should not happen. This is to ensure that for whatever reason this BoundVolumeSnapshotContentName field got removed, we still try to update snapshot status and bring it back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I guess that was your thought as well. Just it seems to be very unlikely, if not impossible, that only the BoundVolumeSnapshotContentName field is lost, if lost, rather the whole status is lost?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL, I still think the check is not quite necessary.

// 2) snapshot.Status.ReadyToUse is false
// 3) snapshot.Status.BoundVolumeSnapshotContentName is not set
if !utils.IsSnapshotReady(snapshot) || !utils.IsBoundVolumeSnapshotContentNameSet(snapshot) {
return ctrl.syncUnreadySnapshot(snapshot)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in syncUnreadySnapshot, for the case where "else if snapshot.Status != nil && snapshot.Status.BoundVolumeSnapshotContentName != nil"
when could that scenario happen?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will be divided into several cases below:

  1. If "snapshot.Status != nil && snapshot.Status.BoundVolumeSnapshotContentName != nil && snapshot.Status.Ready == nil", call syncUnreadySnapshot().
  2. If "snapshot.Status != nil && snapshot.Status.BoundVolumeSnapshotContentName != nil && snapshot.Status.Ready != nil && *snapshot.Status.Ready == false", call syncUnreadySnapshot().
  3. If "snapshot.Status != nil && snapshot.Status.BoundVolumeSnapshotContentName != nil && snapshot.Status.Ready != nil && *snapshot.Status.Ready == true", call syncReadySnapshot().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed around this offline, we might be able to merge this case with the first case, I will leave it to @xing-yang to think further :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at getMatchSnapshotContent(), it is actually not searching in the API server. It is searching in ctrl.contentStore.List() which is the cache store.

I agree that ctrl.contentStore.GetByKey() is still more efficient, so I made the fix.

}

// syncReadySnapshot checks the snapshot which has been bound to snapshot content successfully before.
// If there is any problem with the binding (e.g., snapshot points to a non-exist snapshot content), update the snapshot status and emit event.
// If there is any problem with the binding (e.g., snapshot points to a non-existent snapshot content), update the snapshot status and emit event.
func (ctrl *csiSnapshotCommonController) syncReadySnapshot(snapshot *crdv1.VolumeSnapshot) error {
if !utils.IsBoundVolumeSnapshotContentNameSet(snapshot) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this check necessary? or if it's needed, it rather should be an error instead of returning nil directly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this should be an error. Will fix it.

Copy link
Contributor

@yuxiangqian yuxiangqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per discussion on "getMatchSnapshotContent" call, since it's a non-efficient call and should be only needed in the following situations:

  1. content created but failed to update snapshot status (boundSnapshotContentName == nil) when controller crash or API server error
  2. status of snapshot lost
    propose the following steps to reduce the call to "getMatchSnapshotContent"
  3. based on the naming scheme for snapshot content, since this is only the dynamic provisioning case, call "util.GetSnapshotContentNameForSnapshot" to get the target content name (or use snapshot.Status.BoundSnapshotContentName if not nil?)
  4. use the name to search in local content cache
  5. If not found (it's also possible the controller failed to update local cache after content creation), then use the name to talk to API server to fetch
  6. continue bind/update status if found, otherwise it should fail into creation workflow.

// 1) snapshot.Status is nil
// 2) snapshot.Status.ReadyToUse is false
// 3) snapshot.Status.BoundVolumeSnapshotContentName is not set
if !utils.IsSnapshotReady(snapshot) || !utils.IsBoundVolumeSnapshotContentNameSet(snapshot) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but in what case "IsSnapshotReady" will return true with "BoundVolumeSnapshotContentNameSet" not set? it seems to be not possible?

Copy link
Contributor

@yuxiangqian yuxiangqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits.
LGTM for the rest and happy holidays

klog.V(5).Infof("syncContent: volume snapshot content %+v", content)
// due to the finalizer.
if snapshot != nil && utils.IsSnapshotDeletionCandidate(snapshot) {
err = ctrl.setAnnVolumeSnapshotBeingDeleted(content)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, just "return ctrl.setAnnVolumeSnapshotBeingDeleted(content)" will do the trick?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if snapshot.Status == nil && content.Status != nil {
return true
}
if snapshot.Status != nil && content.Status == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, these two if statements can be one:
if content.Status == nil return false

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if snapshot.Status.ReadyToUse != nil && content.Status.ReadyToUse != nil && snapshot.Status.ReadyToUse != content.Status.ReadyToUse {
return true
}
if (snapshot.Status.RestoreSize == nil && content.Status.RestoreSize != nil) || (snapshot.Status.RestoreSize != nil && snapshot.Status.RestoreSize.IsZero() && content.Status.RestoreSize != nil && *content.Status.RestoreSize > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, prefer it to follow the pattern as status.ReadyToUse to break the OR condition into a separate if.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// If content is deleted already, remove SnapshotBound finalizer
content, err := ctrl.contentExists(snapshot)
content, err := ctrl.getContentFromStore(snapshot)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, changing to:
if err != nil || content == nil {
return nil, false, false, err
}
will save following content checks?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -241,7 +246,7 @@ func (ctrl *csiSnapshotCommonController) checkandRemoveSnapshotFinalizersAndChec
// Check if a volume is being created from snapshot.
inUse := ctrl.isVolumeBeingCreatedFromSnapshot(snapshot)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as "isVolumeBeingCreatedFromSnapshot" could be an expensive call, I suggested to check content == nil firstly.
If content == nil, then inUse could be saved and finalizer can be removed directly. That could be a very common case when a
user creates a snapshot wrongly and there is no content created for it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// processIfDeletionTimestampSet processes finalizers and deletes the content when appropriate
// It checks if contents exists, it checks if snapshot has bi-directional binding, it checks if
// finalizers should be removed, and it checks if content should be deleted and deletes it if needed.
func (ctrl *csiSnapshotCommonController) processIfDeletionTimestampSet(snapshot *crdv1.VolumeSnapshot) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually not talking about restoring, I was talking about PVC source finalizer.
But I guess it should be fine implementing this way as well. basically it means, if a content object has not been created yet, and snapshot is being deleted,
it's fine to remove the source PVC finalizer, no content would be ever created for that snapshot

@xing-yang
Copy link
Collaborator Author

@yuxiangqian addressed your comments. PTAL.

@@ -85,78 +85,68 @@ func (ctrl *csiSnapshotCommonController) syncContent(content *crdv1.VolumeSnapsh

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this log could be removed because of line 88?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@@ -892,6 +893,39 @@ func (ctrl *csiSnapshotCommonController) bindandUpdateVolumeSnapshot(snapshotCon
return snapshotCopy, nil
}

// needsUpdateSnapshotStatus compares snapshot status with the content status and decide
// if snapshot status needs to be updated based on content status
func (ctrl *csiSnapshotCommonController) needsUpdateSnapshotStatus(snapshot *crdv1.VolumeSnapshot, content *crdv1.VolumeSnapshotContent) bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the logic here could be simplified. The purpose is to to snapshot into the queue so that snapshot status could be updated. During sync snapshot, it will check whether and how to update snapshot status based on content status. E.g., if content.status is nil, no update is needed. Don't need to compare status here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we are trying to avoid triggering syncSnapshot if not necessary.

if utils.NeedToAddContentFinalizer(content) {
// Content is not being deleted -> it should have the finalizer.
klog.V(5).Infof("syncContent: Add Finalizer for VolumeSnapshotContent[%s]", content.Name)
return ctrl.addContentFinalizer(content)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why return here?

  1. for dynamic case, finalizer should be already added during creation
  2. for static case, finalizer is added here, if return immediately, snapshot status will not be updated from content status?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding finalizer should trigger content update event, and thenthe content will be re-queue'd to be processed again, that would be when status of snapshot gets updated?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just let it continue to update status after adding finalizer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

certainly could update the status of volume snapshot, just syncContent does not really update snapshot status, samething at line 121, it just put an item into the work queue. I actually prefer to follow the same pattern here and let syncSnapshot handle it.
One thing was not clear to me though, is if there is no VolumeSnapshotContentFinalizer on the content, i.e., NeedToAddContentFinalizer returns false, AND there is no snapshot object, should the finalizer be added?
maybe move this if block after line 121?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there reason why is, if the finalizer is added to the content object regardless, it will make deletion of an orphaned content not easy?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, I meant to add snapshot to the queue to trigger update. With current logic, it takes a little bit longer to wait for controller to process content from the queue again and add snapshot into the queue. Both should work, so I don't have strong objection about it.

  1. sync content, add content finalizer,
  2. content added to the queue back,
  3. sync content again, it goes to check snapshot exist logic and put snapshot into the queue.

klog.V(5).Infof("syncContent: volume snapshot content %+v", content)
}
// due to the finalizer.
if snapshot != nil && utils.IsSnapshotDeletionCandidate(snapshot) {
Copy link
Collaborator

@jingxu97 jingxu97 Dec 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the purpose of IsSnapshotDeletionCandidate?

  1. If snapshot has finalizer (e.g., snapshotInUser), it will set annotation?
  2. if snapshot has no finalizer, it will not set the annotation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It checks if deletionTimestamp is set and finalizer is also set. If true, it sets the annotation. Then the sidecar controller will check if it is safe to delete the content finalizer and delete the content, etc.
This follows the pattern of the finalizers in the in-tree pv/pvc protection controller. It makes sure that it tries to add finalizer first to protection the resource before trying to delete it.


klog.V(5).Infof("processFinalizersAndCheckandDeleteContent[%s]: delete snapshot content and remove finalizer from snapshot if needed", utils.SnapshotKey(snapshot))
err = ctrl.checkandRemoveSnapshotFinalizersAndCheckandDeleteContent(snapshot, content, deleteContent)
// processIfDeletionTimestampSet processes finalizers and deletes the content when appropriate
Copy link
Collaborator

@jingxu97 jingxu97 Dec 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

processIfDeletionTimestampSet --> syncDeletingSnapshot?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to processSnapshotWithDeletionTimestamp. "syncDeletingSnapshot" would be confused with other functions such as syncSnapshot.

func (ctrl *csiSnapshotCommonController) processFinalizersAndCheckandDeleteContent(snapshot *crdv1.VolumeSnapshot) error {
klog.V(5).Infof("processFinalizersAndCheckandDeleteContent VolumeSnapshot[%s]: %s", utils.SnapshotKey(snapshot), utils.GetSnapshotStatusForLogging(snapshot))

func (ctrl *csiSnapshotCommonController) checkContentAndBoundStatus(snapshot *crdv1.VolumeSnapshot) (*crdv1.VolumeSnapshotContent, bool, bool, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comments about this function

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

// processIfDeletionTimestampSet processes finalizers and deletes the content when appropriate
// It checks if contents exists, it checks if snapshot has bi-directional binding, it checks if
// finalizers should be removed, and it checks if content should be deleted and deletes it if needed.
func (ctrl *csiSnapshotCommonController) processIfDeletionTimestampSet(snapshot *crdv1.VolumeSnapshot) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the logic implemented here is confusing. There are several layers of calls. Is the following logic sufficient?

if snapshot.DeletionTimeStamp is set (it indicates it must has finalizer)

  1. if snapshotAsSource finalizer is set
    -- (double check pvc is still using it) -- do nothing
    -- if pvc finishes restoring, remove finalizer
  2. if SnapshotBound finalizer is set
    -- content exist, set deletiontimestamp on content (can also set annotation at the same time here)
    -- context does not exist, delete finalizer

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 3 steps in processIfDeletionTimestampSet (renamed to processSnapshotWithDeletionTimestamp):

  1. Call a helper function checkContentAndBoundStatus() to check content and bound status.
  2. Call checkandRemoveSnapshotFinalizersAndCheckandDeleteContent() with information obtained from step 1. This function name is very long but the name suggests what it does. It determines whether to remove finalizers on snapshot and whether to delete content.
  3. Call checkandRemovePVCFinalizer() to determine whether to remove the finalizer on PVC.

I added comments to explain that.

@@ -266,7 +270,12 @@ func (ctrl *csiSnapshotCommonController) checkandRemoveSnapshotFinalizersAndChec
}

// checkandAddSnapshotFinalizers checks and adds snapshot finailzers when needed
func (ctrl *csiSnapshotCommonController) checkandAddSnapshotFinalizers(snapshot *crdv1.VolumeSnapshot, snapshotBound bool, deleteContent bool) {
func (ctrl *csiSnapshotCommonController) checkandAddSnapshotFinalizers(snapshot *crdv1.VolumeSnapshot) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the logic is to addSnapshotAsSourceFinalizer no matter whether there is PVC is currently using snapshot. I think it should be fine as long as we have correct logic to remove it when needed. But worth to put a note about it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added note before NeedToAddSnapshotAsSourceFinalizer().

@@ -171,53 +158,68 @@ func (ctrl *csiSnapshotCommonController) syncContent(content *crdv1.VolumeSnapsh
func (ctrl *csiSnapshotCommonController) syncSnapshot(snapshot *crdv1.VolumeSnapshot) error {
Copy link
Collaborator

@jingxu97 jingxu97 Dec 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel the logic inside this function is kind of hard to follow. Trying to see whether the following top level functions (we can work out details of functions later) can help.

func (ctrl *csiSnapshotCommonController) syncSnapshot(snapshot *crdv1.VolumeSnapshot) error {
	klog.V(5).Infof("synchronizing VolumeSnapshot[%s]: %s", utils.SnapshotKey(snapshot), utils.GetSnapshotStatusForLogging(snapshot))

	if snapshot.ObjectMeta.DeletionTimestamp != nil {
		return ctrl.syncDeletingSnapshot(snapshot)

	} else {
		if snapshot.Status == nil {
			return ctrl.syncUnBoundSnapshot(snapshot)
		} else {
			return ctrl.syncBoundSnapshot(snapshot)
		}
	}
	return nil
}

func (ctrl *csiSnapshotCommonController) syncUnBoundSnapshot(snapshot *crdv1.VolumeSnapshot) error {
	content, err := ctrl.getContentFromStore(snapshot)
	if err != nil {
		return err
	}
	
	if content != nil {
		addBoundFinalizer(snapshot)
		addSnapshotAsSourceFinalizer(snapshot)
		return updateSnapshotFromContent(snapshot, content)
	} else {
		if isStaticBinding(snapshot) {
			return nil
		}
		addBoundFinalizer(snapshot)
		return createSnapshotContent(snapshot)
	}

}

func (ctrl *csiSnapshotCommonController) syncBoundSnapshot(snapshot *crdv1.VolumeSnapshot) error {
	content, err := ctrl.getContentFromStore(snapshot)
	if err != nil {
		return err
	}
	
	if content == nil {
		return updateSnapshotWithNoContentError(snapshot)
	} else {
		return updateSnapshotFromContent(snapshot, content)
	}
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially thought about splitting syncSnapshot by whether snapshot.Status is nil or not, but I found that it will make the logic more complicated to follow. If snapshot.Status != nil but it is empty or it is not empty but only with boundVolumeSnapshotContentName set but readyToUse is nil or false, and assume content != nil, we'll go ahead to update snapshot status from content status. However content status could also be nil. So we'll have to return without any update on snapshot status and wait for the next sync. Eventually we'll get there but it will take longer. It will also have duplicate logic to update status spreading in both syncSnapshotWithStatus and syncSnapshotWithoutStatus.

So I think the current logic is more straight-forward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's open an issue to resolve this, I think the logic could be simplified, may deserve an issue to track separately.

// 1) snapshot.Status is nil
// 2) snapshot.Status.ReadyToUse is false
// 3) snapshot.Status.BoundVolumeSnapshotContentName is not set
if !utils.IsSnapshotReady(snapshot) || !utils.IsBoundVolumeSnapshotContentNameSet(snapshot) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL, I still think the check is not quite necessary.

addSourceFinalizer := false
addBoundFinalizer := false
// NOTE: Source finalizer will be added to snapshot if
// DeletionTimeStamp is nil and it is not set yet.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, "it is not set yet" means what has not been set yet, the deletion time stamp?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It refers to the source finalizer. I'll just remove the "and it is not set yet" part.

// isVolumeBoundToClaim returns true, if given volume is pre-bound or bound
// to specific claim. Both claim.Name and claim.Namespace must be equal.
// If claim.UID is present in volume.Spec.ClaimRef, it must be equal too.
func (ctrl *csiSnapshotCommonController) isVolumeBoundToClaim(volume *v1.PersistentVolume, claim *v1.PersistentVolumeClaim) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed this call in previous rounds
I think this needs to be more restricted, i.e., if volume.Spec.ClaimRef.UID == "", it should rather . return false? the binding has not been done yet, this call is used to obtain volume information, in the case when bi-directional binding is not complete yet, it should not return true.

Copy link
Collaborator Author

@xing-yang xing-yang Dec 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is strictly from the checking pv/pvc binding fix. If it is an issue, let's take a closer look and fix it separately and backport to previous releases.

@yuxiangqian
Copy link
Contributor

LGTM, please create issues to track unresolved concerns.

@yuxiangqian
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 25, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: xing-yang, yuxiangqian

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [xing-yang,yuxiangqian]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 8179759 into kubernetes-csi:master Dec 25, 2019
@xing-yang
Copy link
Collaborator Author

LGTM, please create issues to track unresolved concerns.

Thanks! I'll open issues.

xing-yang added a commit to xing-yang/external-snapshotter that referenced this pull request Aug 23, 2022
d24254f Merge pull request kubernetes-csi#202 from xing-yang/kind_0.14.0
0faa3fc Update to Kind v0.14.0 images
ef4e1b2 Merge pull request kubernetes-csi#201 from xing-yang/add_1.24_image
4ddce25 Add 1.24 Kind image
7fe5149 Merge pull request kubernetes-csi#200 from pohly/bump-kubernetes-version
70915a8 prow.sh: update snapshotter version
31a3f38 Merge pull request kubernetes-csi#199 from pohly/bump-kubernetes-version
7577454 prow.sh: bump Kubernetes to v1.22.0

git-subtree-dir: release-tools
git-subtree-split: d24254f
@xing-yang xing-yang mentioned this pull request Aug 23, 2022
mowangdk added a commit to mowangdk/external-snapshotter that referenced this pull request Aug 6, 2023
670bb0e Merge pull request kubernetes-csi#229 from marosset/fix-codespell-errors
35d5e78 Merge pull request kubernetes-csi#219 from yashsingh74/update-registry
63473cc Merge pull request kubernetes-csi#231 from coulof/bump-go-version-1.20.5
29a5c76 Merge pull request kubernetes-csi#228 from mowangdk/chore/adopt_kubernetes_recommand_labels
8dd2821 Update cloudbuild image with go 1.20.5
1df23db Merge pull request kubernetes-csi#230 from msau42/prow
1f92b7e Add ginkgo timeout to e2e tests to help catch any stuck tests
2b8b80e fixing some codespell errors
c10b678 Merge pull request kubernetes-csi#227 from coulof/check-sidecar-supported-versions
72984ec chore: adopt kubernetes recommand label
b055535 Header
bd0a10b typo
c39d73c Add comments
f6491af Script to verify EOL sidecar version
4133d1d Merge pull request kubernetes-csi#226 from msau42/cloudbuild
8d519d2 Pin buildkit to v0.10.6 to workaround v0.11 bug with docker manifest
6e04a03 Merge pull request kubernetes-csi#224 from msau42/cloudbuild
26fdfff Update cloudbuild image
6613c39 Merge pull request kubernetes-csi#223 from sunnylovestiramisu/update
0e7ae99 Update k8s image repo url
77e47cc Merge pull request kubernetes-csi#222 from xinydev/fix-dep-version
155854b Fix dep version mismatch
8f83905 Merge pull request kubernetes-csi#221 from sunnylovestiramisu/go-update
1d3f94d Update go version to 1.20 to match k/k v1.27
e322ce5 Merge pull request kubernetes-csi#220 from andyzhangx/fix-golint-error
b74a512 test: fix golint error
901bcb5 Update registry k8s.gcr.io -> registry.k8s.io
aa61bfd Merge pull request kubernetes-csi#218 from xing-yang/update_csi_driver
7563d19 Update CSI_PROW_DRIVER_VERSION to v1.11.0
a2171be Merge pull request kubernetes-csi#216 from msau42/process
cb98782 Merge pull request kubernetes-csi#217 from msau42/owners
a11216e add new reviewers and remove inactive reviewers
dd98675 Add step for checking builds
b66c082 Merge pull request kubernetes-csi#214 from pohly/junit-fixes
b9b6763 filter-junit.go: fix loss of testcases when parsing Ginkgo v2 JUnit
d427783 filter-junit.go: preserve system error log
38e1146 prow.sh: publish individual JUnit files as separate artifacts
78c0fb7 Merge pull request kubernetes-csi#208 from jsafrane/skip-selinux
36e433e Skip SELinux tests in CI by default
348d4a9 Merge pull request kubernetes-csi#207 from RaunakShah/reviewers
1efc272 Merge pull request kubernetes-csi#206 from RaunakShah/update-prow
7d410d8 Changes to csi prow to run e2e tests in sidecars
cfa5a75 Merge pull request kubernetes-csi#203 from humblec/test-vendor
4edd1d8 Add RaunakShah to CSI reviewers group
7ccc959 release tools update to 1.19
d24254f Merge pull request kubernetes-csi#202 from xing-yang/kind_0.14.0
0faa3fc Update to Kind v0.14.0 images
ef4e1b2 Merge pull request kubernetes-csi#201 from xing-yang/add_1.24_image
4ddce25 Add 1.24 Kind image
7fe5149 Merge pull request kubernetes-csi#200 from pohly/bump-kubernetes-version
70915a8 prow.sh: update snapshotter version
31a3f38 Merge pull request kubernetes-csi#199 from pohly/bump-kubernetes-version
7577454 prow.sh: bump Kubernetes to v1.22.0
d29a2e7 Merge pull request kubernetes-csi#198 from pohly/csi-test-5.0.0
41cb70d prow.sh: sanity testing with csi-test v5.0.0
c85a63f Merge pull request kubernetes-csi#197 from pohly/fix-alpha-testing
b86d8e9 support Kubernetes 1.25 + Ginkgo v2
ab0b0a3 Merge pull request kubernetes-csi#192 from andyzhangx/patch-1
7bbab24 Merge pull request kubernetes-csi#196 from humblec/non-alpha
e51ff2c introduce control variable for non alpha feature gate configuration
ca19ef5 Merge pull request kubernetes-csi#195 from pohly/fix-alpha-testing
3948331 fix testing with latest Kubernetes
e4dab7f Merge pull request kubernetes-csi#194 from yselkowitz/registry-k8s-io
84a4d5a Move from k8s.gcr.io to registry.k8s.io
9a0260c fix boilerplate header
37d1104 Merge pull request kubernetes-csi#191 from pohly/go-1.18
db917f5 update to Go 1.18
335339f Merge pull request kubernetes-csi#187 from mauriciopoppe/remove-eol-windows-versions
890b87a Merge pull request kubernetes-csi#188 from pwschuurman/update-release-notes-docs
274bc9b Update Sidecar Release Process documentation to reference latest syntax for release-notes tool
87b6c37 Merge pull request kubernetes-csi#185 from Garima-Negi/fix-OWNERS-files
f1de2c6 Fix OWNERS file - squashed commits
59ae38b Remove EOL windows versions from BUILD_PLATFORMS
5d66471 Merge pull request kubernetes-csi#186 from humblec/sp
d066f1b Correct prow.sh typo and make codespell linter pass
762e22d Merge pull request kubernetes-csi#184 from pohly/image-publishing-troubleshooting
81e26c3 SIDECAR_RELEASE_PROCESS.md: add troubleshooting for image publishing
31aa44d Merge pull request kubernetes-csi#182 from chrishenzie/csi-sanity-version
f49e141 Update csi-sanity test suite to v4.3.0
d9815c2 Merge pull request kubernetes-csi#181 from mauriciopoppe/armv7-support
05c1801 Add support to build arm/v7 images
4aedf35 Merge pull request kubernetes-csi#178 from xing-yang/timeout
2b9897e Increase build timeout
51d3702 Merge pull request kubernetes-csi#177 from mauriciopoppe/kind-image-1.23
a30efea Add kind image for 1.23
a6a1a79 Merge pull request kubernetes-csi#176 from pohly/go-1.17.3
0a2cf63 prow.sh: bump Go to 1.17.3
fc29fdd Merge pull request kubernetes-csi#141 from pohly/prune-replace-optional
5b9a1e0 Merge pull request kubernetes-csi#175 from jimdaga/patch-1
5eeb602 images: use k8s-staging-test-infra/gcb-docker-gcloud
b46691a go-get-kubernetes.sh: make replace statement pruning optional

git-subtree-dir: release-tools
git-subtree-split: 670bb0e
andyzhangx pushed a commit to andyzhangx/external-snapshotter that referenced this pull request Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants