| 1 | +# Using Persistent Volumes as Virtual Machine disks |
| 2 | + |
| 3 | +Author: Fabian Deutsch \<[email protected]\> |
| 4 | + |
| 5 | +## Introduction |
| 6 | + |
| 7 | +Virtual Machines usually have disks attached. Disks are not always required, but |
| 8 | +they are valuable when data needs to persist. |
| 9 | + |
| 10 | +Kubernetes provides persistent storage through Persistent Volumes and |
| 11 | +Claims. |
| 12 | + |
| 13 | +The purpose of this proposal is to describe a mechanism to use Persistent |
| 14 | +Volumes as a backing store for Virtual Machine disks. |
| 15 | + |
| 16 | + |
| 17 | +### Use-case |
| 18 | + |
| 19 | +The primary use-case is to attach regular (writable) disks to Virtual Machines |
| 20 | +which are backed by Persistent Volumes. |
| 21 | + |
| 22 | + |
| 23 | +## API |
| 24 | + |
| 25 | +This section describes how Persistent Volumes are referenced |
| 26 | +in the `VM` resource type. |
| 27 | + |
| 28 | +In general, the referencing is aligned with how pods consume Persistent |
| 29 | +Volume Claims, as described [here](https://kubernetes.io/docs/api-reference/v1.5/#persistentvolumeclaimvolumesource-v1). |
| 30 | + |
| 31 | +Today the `VM.spec.domain` reflects much of |
| 32 | +[libvirt's domain xml specification](http://libvirt.org/formatdomain.html#elementsDisks). |
| 33 | +To communicate the new storage type through the API, an additional disk type |
| 34 | +`PersistentVolumeClaim` is accepted. |
| 35 | +For a disk of type `PersistentVolumeClaim`, the `disk/source/name` attribute |
| 36 | +names the claim to use. |
| 37 | + |
| 38 | +Example with the following PV and PVC: |
| 39 | + |
| 40 | +```yaml |
| 41 | +# For the sake of completeness, a volume and a claim |
| 42 | +kind: PersistentVolume |
| 43 | +metadata: |
| 44 | +  name: pv001 |
| 45 | +  labels: |
| 46 | +    release: "stable" |
| 47 | +spec: |
| 48 | +  capacity: |
| 49 | +    storage: 5Gi |
| 50 | +  iscsi: |
| 51 | +    targetPortal: example.com:3260 |
| 52 | +    iqn: iqn.2013-07.com.example:iscsi-nopool/ |
| 53 | +    lun: 0 |
| 54 | +--- |
| 55 | +kind: PersistentVolumeClaim |
| 56 | +metadata: |
| 57 | +  name: disk-01 |
| 58 | +spec: |
| 59 | +  resources: |
| 60 | +    requests: |
| 61 | +      storage: 4Gi |
| 62 | +  selector: |
| 63 | +    matchLabels: |
| 64 | +      release: "stable" |
| 65 | +``` |
| 66 | + |
| 67 | +This is used by the VM in the following way: |
| 68 | + |
| 69 | +```yaml |
| 70 | +kind: VM |
| 71 | +spec: |
| 72 | +  domain: |
| 73 | +    devices: |
| 74 | +      disks: |
| 75 | +      - type: PersistentVolumeClaim |
| 76 | +        source: |
| 77 | +          name: disk-01 |
| 78 | +        target: |
| 79 | +          bus: scsi |
| 80 | +          target: sda |
| 81 | +``` |
| 82 | + |
| 83 | + |
| 84 | +## Implementation |
| 85 | + |
| 86 | +### Flow |
| 87 | + |
| 88 | +1. The user adds an existing `PersistentVolumeClaim` to the VM instance, as |
| 89 | +   described above. |
| 90 | +2. When the VM Pod is scheduled on a host, the `virt-handler` |
| 91 | +   identifies the claim, translates it into the corresponding |
| 92 | +   libvirt representation, and includes it in the domain XML. |
| 93 | + |
| 94 | +**Note**: The `virt-controller` does not do anything with the claim. |
| 95 | + |
| 96 | +**Note**: In this flow, the claim is _not_ used by the pod. It |
| 97 | +is only used by the `virt-handler` to identify the connection details |
| 98 | +to the storage. |
| 99 | + |
| 100 | +Because VMs only accept block storage as disks, the handler can only |
| 101 | +accept claims which are backed by block storage types. |
| 102 | + |
| 103 | + |
| 104 | +### `virt-handler` changes |
| 105 | + |
| 106 | +Once a VM is scheduled on a host, the `virt-handler` transforms the |
| 107 | +VM Spec into a libvirt domain XML. |
| 108 | + |
| 109 | +During this transformation, every disk which is of type `PersistentVolumeClaim` |
| 110 | +needs to be transformed into an adequate libvirt disk type. |
| 111 | + |
| 112 | +To achieve this, the `virt-handler` needs to look at the volume bound to the |
| 113 | +claim to identify its `type` and connection details. |
| 114 | +Note that the claim itself will contain the connection details if it is |
| 115 | +associated with a volume. If the claim is _not_ associated with |
| 116 | +a volume, the launch of the VM should fail. |
| 117 | +If the type is either iSCSI or RBD, the handler can take the connection details |
| 118 | +and transform them into the correct libvirt representation. |
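
The lookup described above can be sketched as follows. This is only an illustration under stated assumptions, not the `virt-handler` implementation: the claim and volumes are modelled as plain dicts, the bound volume is found through the claim's `spec.volumeName` field, and the names `resolve_claim`, `ClaimError`, and `SUPPORTED_TYPES` are hypothetical.

```python
# Hypothetical sketch: find the volume bound to a claim and return its
# storage type and connection details, failing for unusable claims.

SUPPORTED_TYPES = ("iscsi", "rbd")  # the two types named in this proposal


class ClaimError(Exception):
    """Raised when a claim cannot be used as a VM disk."""


def resolve_claim(claim, volumes):
    """Return (source_type, details) for the volume bound to `claim`.

    `claim` is a dict shaped like a PersistentVolumeClaim; `volumes` maps
    PersistentVolume names to dicts shaped like PersistentVolumes.
    """
    claim_name = claim.get("metadata", {}).get("name", "<unnamed>")
    volume_name = claim.get("spec", {}).get("volumeName")
    if not volume_name or volume_name not in volumes:
        # An unbound claim means the VM launch should fail.
        raise ClaimError("claim %r is not bound to a volume" % claim_name)

    volume_spec = volumes[volume_name].get("spec", {})
    for source_type in SUPPORTED_TYPES:
        if source_type in volume_spec:
            return source_type, volume_spec[source_type]

    # Anything else is treated as unsupported in this sketch.
    raise ClaimError("claim %r is not backed by iSCSI or RBD" % claim_name)
```

Called with a claim bound to `pv001` from the earlier example, this would return `"iscsi"` together with the `targetPortal`, `iqn`, and `lun` values; an unbound claim raises `ClaimError`, which corresponds to failing the VM launch.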
| 119 | + |
| 120 | +Example for an iSCSI volume: |
| 121 | + |
| 122 | +```xml |
| 123 | +  <disk type='network' device='disk'> |
| 124 | +    <driver name='qemu' type='raw'/> |
| 125 | +    <source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-nopool/2'> |
| 126 | +      <host name='example.com' port='3260'/> |
| 127 | +    </source> |
| 128 | +  </disk> |
| 129 | +``` |
| 130 | + |
| 131 | +In the snippet above, most of the content is hard-coded. Only the values |
| 132 | +for `disk/source/name`, `disk/source/host/name`, and `disk/source/host/port` are |
| 133 | +populated with the values from the Persistent Volume Claim. |
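
To illustrate how the three values could be filled in, here is a minimal sketch that renders the disk element from iSCSI connection details. It is an assumption-laden example rather than the actual `virt-handler` code: the function name `iscsi_disk_xml` is hypothetical, the portal is assumed to be a plain `host:port` string (no IPv6 literals), the port falls back to the standard iSCSI port 3260, and the source name is assumed to be the IQN followed by `/` and the LUN, as in the snippet above.

```python
import xml.etree.ElementTree as ET


def iscsi_disk_xml(target_portal, iqn, lun):
    """Render a libvirt network disk element for an iSCSI target.

    Hypothetical helper: only the source name, host name, and port vary;
    everything else is hard-coded as in the snippet above.
    """
    # Assumes "host:port" or just "host"; IPv6 portals are not handled.
    host, _, port = target_portal.partition(":")

    disk = ET.Element("disk", {"type": "network", "device": "disk"})
    ET.SubElement(disk, "driver", {"name": "qemu", "type": "raw"})
    source = ET.SubElement(disk, "source", {
        "protocol": "iscsi",
        # Source name is "<iqn>/<lun>"; avoid a doubled slash.
        "name": "%s/%d" % (iqn.rstrip("/"), lun),
    })
    ET.SubElement(source, "host", {"name": host, "port": port or "3260"})
    return ET.tostring(disk, encoding="unicode")


if __name__ == "__main__":
    print(iscsi_disk_xml("example.com:3260",
                         "iqn.2013-07.com.example:iscsi-nopool/", 2))
```

Building the element tree instead of formatting a string keeps attribute values properly escaped, which matters if an IQN or host name ever contains characters that are special in XML.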
| 134 | + |
| 135 | +Once the VM starts up, `qemu` connects to the target and attaches |
| 136 | +the disk. |