Skip to content

Cinder volumes attacher uses NodeName instead of instanceID #39978

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
DukeXar opened this issue Jan 16, 2017 · 4 comments · Fixed by #39998
Closed

Cinder volumes attacher uses NodeName instead of instanceID #39978

DukeXar opened this issue Jan 16, 2017 · 4 comments · Fixed by #39998
Labels
area/provider/openstack Issues or PRs related to openstack provider kind/bug Categorizes issue or PR as related to a bug. sig/storage Categorizes an issue or PR as relevant to SIG Storage.

Comments

@DukeXar
Copy link
Contributor

DukeXar commented Jan 16, 2017

Is this a request for help?: No

What keywords did you search in Kubernetes issues before filing this one?: openstack cinder volume instanceid nodename


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Kubernetes version (use kubectl version): v1.5.1

Environment:

What happened: The Cinder volume was not remounted to other node after POD was rescheduled.

What you expected to happen: Cinder volume is detached from previous node mounted on a new one.

How to reproduce it (as minimally and precisely as possible):

  1. Create a PVC using Cinder;
  2. Create a POD under replication controller with one replica using that PVC;
  3. Wait for POD to be scheduled and started;
  4. Cordon the node where POD is running;
  5. Delete the POD;
  6. POD is rescheduled to another node, but volume is not attached - it keeps attaching and detaching it to the new node;

Anything else do we need to know:

Logs showing that attachment succeeds, but when attacher performs rechecking during reconciliation cycle, it incorrectly decides that volume is not attached any longer:

I0113 23:45:20.854316       7 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/cinder/7c0ef8d0-fa65-41b5-b634-098305579935" to node "ak8s-worker-1"
I0113 23:45:22.116749       7 openstack_volumes.go:126] 7c0ef8d0-fa65-41b5-b634-098305579935 kubernetes-dynamic-pvc-6635944c-d431-11e6-a6be-fa163e24f3a8 [map[server_id:b6e0e07c-d64b-45bc-b09d-f8784fa4fbf7 attachment_id:fc236602-3f75-4a6e-beb7-12eeed453cca host_name:<nil> volume_id:7c0ef8d0-fa65-41b5-b634-098305579935 device:/dev/vde id:7c0ef8d0-fa65-41b5-b634-098305579935]]
I0113 23:45:22.116793       7 attacher.go:92] Attach operation is successful. volume "7c0ef8d0-fa65-41b5-b634-098305579935" is already attached to instance "b6e0e07c-d64b-45bc-b09d-f8784fa4fbf7".
I0113 23:45:22.603837       7 openstack_volumes.go:126] 7c0ef8d0-fa65-41b5-b634-098305579935 kubernetes-dynamic-pvc-6635944c-d431-11e6-a6be-fa163e24f3a8 [map[attachment_id:fc236602-3f75-4a6e-beb7-12eeed453cca host_name:<nil> volume_id:7c0ef8d0-fa65-41b5-b634-098305579935 device:/dev/vde id:7c0ef8d0-fa65-41b5-b634-098305579935 server_id:b6e0e07c-d64b-45bc-b09d-f8784fa4fbf7]]
I0113 23:45:22.603879       7 operation_generator.go:217] AttachVolume.Attach succeeded for volume "kubernetes.io/cinder/7c0ef8d0-fa65-41b5-b634-098305579935" (spec.Name: "pvc-6635944c-d431-11e6-a6be-fa163e24f3a8") from node "ak8s-worker-1".
I0113 23:45:22.603967       7 actual_state_of_world.go:447] Report volume "kubernetes.io/cinder/7c0ef8d0-fa65-41b5-b634-098305579935" as attached to node "ak8s-worker-1"
I0113 23:45:22.682490       7 node_status_updater.go:136] Updating status for node "ak8s-worker-1" succeeded. patchBytes: "{}" VolumesAttached: [{kubernetes.io/cinder/7c0ef8d0-fa65-41b5-b634-098305579935 /dev/vde}]
I0113 23:45:23.078355       7 actual_state_of_world.go:343] SetVolumeMountedByNode volume kubernetes.io/cinder/7c0ef8d0-fa65-41b5-b634-098305579935 to the node "ak8s-worker-1" mounted true
I0113 23:45:23.767648       7 openstack_volumes.go:126] 7c0ef8d0-fa65-41b5-b634-098305579935 kubernetes-dynamic-pvc-6635944c-d431-11e6-a6be-fa163e24f3a8 [map[host_name:<nil> volume_id:7c0ef8d0-fa65-41b5-b634-098305579935 device:/dev/vde id:7c0ef8d0-fa65-41b5-b634-098305579935 server_id:b6e0e07c-d64b-45bc-b09d-f8784fa4fbf7 attachment_id:fc236602-3f75-4a6e-beb7-12eeed453cca]]
I0113 23:45:23.767768       7 attacher.go:140] VolumesAreAttached: check volume "7c0ef8d0-fa65-41b5-b634-098305579935" (specName: "pvc-6635944c-d431-11e6-a6be-fa163e24f3a8") is no longer attached
I0113 23:45:23.768069       7 operation_generator.go:162] VerifyVolumesAreAttached determined volume "kubernetes.io/cinder/7c0ef8d0-fa65-41b5-b634-098305579935" (spec.Name: "pvc-6635944c-d431-11e6-a6be-fa163e24f3a8") is no longer attached to node "ak8s-worker-1", therefore it was marked as detached.
I0113 23:45:23.787455       7 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/cinder/7c0ef8d0-fa65-41b5-b634-098305579935" to node "ak8s-worker-1"

Later analysis showed that Cinder attacher uses NodeName, instead of instanceID, compare e.g. with Attach operation and DisksAreAttached

@resouer
Copy link
Contributor

resouer commented Jan 17, 2017

It seems to be a bug, would you like to send a PR to fix it please?

@resouer resouer added kind/bug Categorizes issue or PR as related to a bug. area/provider/openstack Issues or PRs related to openstack provider sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels Jan 17, 2017
@DukeXar
Copy link
Contributor Author

DukeXar commented Jan 17, 2017

@resouer done

What do you think about test coverage of that thing?

Also would not it be better to have instance ids cached in the node structure, so that they are not being looked up every reconcile cycle?

Edit: Seems there is #31646 and #31321 to address this partially.

@resouer
Copy link
Contributor

resouer commented Jan 17, 2017

I think we will focus on this issue and leave node structure refactoring to further issue.

@xsgordon
Copy link

/cc @k8s-sig-openstack-bugs

k8s-github-robot pushed a commit that referenced this issue Feb 10, 2017
Automatic merge from submit-queue (batch tested with PRs 41246, 39998)

Cinder volume attacher: use instanceID instead of NodeID when verifying attachment

**What this PR does / why we need it**: Cinder volume attacher incorrectly uses NodeID instead of openstack instance id, so that reconciliation fails.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #39978 

**Special notes for your reviewer**:

**Release note**:

```release-note
```
dims pushed a commit to dims/kubernetes that referenced this issue Feb 8, 2018
Automatic merge from submit-queue (batch tested with PRs 41246, 39998)

Cinder volume attacher: use instanceID instead of NodeID when verifying attachment

**What this PR does / why we need it**: Cinder volume attacher incorrectly uses NodeID instead of openstack instance id, so that reconciliation fails.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes kubernetes#39978 

**Special notes for your reviewer**:

**Release note**:

```release-note
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/provider/openstack Issues or PRs related to openstack provider kind/bug Categorizes issue or PR as related to a bug. sig/storage Categorizes an issue or PR as relevant to SIG Storage.
Projects
None yet
3 participants