Commit e5e86f7

Aligned spec.md error messages for NodePublishDevice/NodeUnpublishDevice. Clarified some wording
1 parent a603ab3 commit e5e86f7

1 file changed

+36
-127
lines changed

spec.md

@@ -1061,9 +1061,9 @@ It is NOT REQUIRED for a controller plugin to implement the `LIST_VOLUMES` capab
10611061
A Node Plugin MUST implement this RPC call if it has `PUBLISH_UNPUBLISH_DEVICE` node capability.
10621062
This RPC is called by the CO when a workload that wants to use the specified volume is placed (scheduled) on a node.
10631063
The Plugin SHALL assume that this RPC will be executed on the node where the volume will be used.
1064-
This RPC MUST be called by the CO once per node, per volume.
1065-
If the corresponding Controller Plugin has `PUBLISH_UNPUBLISH_VOLUME` controller capability and the Node Plugin has `PUBLISH_UNPUBLISH_DEVICE`, then the CO MUST guarantee that this RPC is called after `ControllerPublishVolume` is called for the given volume on the given node and returns a success.
1066-
The CO MUST guarantee that this RPC is called and returns a success before `NodePublishVolume` is called for the given volume on the given node.
1064+
This RPC MUST be called by the CO at most once per node, per volume.
1065+
If the corresponding Controller Plugin has `PUBLISH_UNPUBLISH_VOLUME` controller capability and the Node Plugin has `PUBLISH_UNPUBLISH_DEVICE` capability, then the CO MUST guarantee that this RPC is called after `ControllerPublishVolume` is called for the given volume on the given node and returns a success.
1066+
The CO MUST guarantee that this RPC is called and returns a success before any `NodePublishVolume` is called for the given volume on the given node.
10671067
This operation MUST be idempotent.
10681068
If this RPC failed, or the CO does not know if it failed or not, it MAY choose to call `NodePublishDevice` again, or choose to call `NodeUnpublishDevice`.
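The ordering guarantees above can be sketched from the CO's point of view. This is a minimal illustration, not part of the spec: the function name `publish_call_order` and its boolean parameters are hypothetical, and it only returns the RPC names in the order the text requires.

```python
from typing import List


def publish_call_order(controller_publish: bool, node_publish_device: bool) -> List[str]:
    """Return the RPC names a CO would issue, in order, to publish one volume on one node.

    Hypothetical helper: the two flags stand for the PUBLISH_UNPUBLISH_VOLUME
    controller capability and the PUBLISH_UNPUBLISH_DEVICE node capability.
    """
    calls = []
    if controller_publish:
        # Must be called, and return success, before NodePublishDevice.
        calls.append("ControllerPublishVolume")
    if node_publish_device:
        # Called at most once per node, per volume, and must return
        # success before any NodePublishVolume for that volume.
        calls.append("NodePublishDevice")
    calls.append("NodePublishVolume")
    return calls
```

A CO that sees neither capability would, under this sketch, go straight to `NodePublishVolume`.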
10691069

@@ -1094,24 +1094,31 @@ message NodePublishDeviceRequest {
10941094
VolumeCapability volume_capability = 5;
10951095
}
10961096
1097-
message NodePublishDeviceResponse {
1098-
message Result {}
1099-
1100-
// One of the following fields MUST be specified.
1101-
oneof reply {
1102-
Result result = 1;
1103-
Error error = 2;
1104-
}
1105-
}
1097+
message NodePublishDeviceResponse {}
11061098
```
11071099

1100+
#### NodePublishDevice Errors
1101+
1102+
If the plugin is unable to complete the NodePublishDevice call successfully, it MUST return a non-ok gRPC code in the gRPC status.
1103+
If the conditions defined below are encountered, the plugin MUST return the specified gRPC error code.
1104+
The CO MUST implement the specified error recovery behavior when it encounters the gRPC error code.
1105+
1106+
| Condition | gRPC Code | Description | Recovery Behavior |
1107+
|-----------|-----------|-------------|-------------------|
1108+
| Volume does not exist | 5 NOT_FOUND | Indicates that a volume corresponding to the specified `volume_id` does not exist. | Caller MUST verify that the `volume_id` is correct and that the volume is accessible and has not been deleted before retrying with exponential back off. |
1109+
| Volume published but is incompatible | 6 ALREADY_EXISTS | Indicates that a volume corresponding to the specified `volume_id` has already been published at the specified `global_target_path` but is incompatible with the specified `volume_capability` flag. | Caller MUST fix the arguments before retrying. |
1110+
| Operation pending for volume | 10 ABORTED | Indicates that there is already an operation pending for the specified volume. In general the Cluster Orchestrator (CO) is responsible for ensuring that there is no more than one call "in-flight" per volume at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same volume. The Plugin SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified volume, and then retry with exponential back off. |
1111+
| Exceeds capabilities | 9 FAILED_PRECONDITION | Indicates that the CO has exceeded the volume's capabilities because the volume does not have MULTI_NODE capability. | Caller MAY choose to call `ValidateVolumeCapabilities` to validate the volume capabilities, or wait for the volume to be unpublished on the node. |
1112+
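As a rough illustration of how a plugin might map the conditions in the table to status codes while staying idempotent, here is a minimal in-memory sketch. The class `DevicePublisher`, its fields, and the `Code` enum are hypothetical names introduced for this example; the numeric values are the standard gRPC status codes. The `FAILED_PRECONDITION` case is not modelled, and no real mount is performed.

```python
from enum import IntEnum


class Code(IntEnum):
    # Standard gRPC status code values.
    OK = 0
    NOT_FOUND = 5
    ALREADY_EXISTS = 6
    FAILED_PRECONDITION = 9
    ABORTED = 10


class DevicePublisher:
    """Hypothetical in-memory stand-in for a Node Plugin's NodePublishDevice handler."""

    def __init__(self, known_volumes):
        self.known_volumes = set(known_volumes)
        # (volume_id, global_target_path) -> volume_capability used at publish time
        self.published = {}
        # volume IDs that currently have an operation in progress
        self.in_flight = set()

    def node_publish_device(self, volume_id, global_target_path, volume_capability):
        if volume_id not in self.known_volumes:
            return Code.NOT_FOUND
        if volume_id in self.in_flight:
            # Reject a secondary call for the same volume.
            return Code.ABORTED
        key = (volume_id, global_target_path)
        if key in self.published:
            if self.published[key] == volume_capability:
                # Repeating an identical, already-completed call succeeds (idempotency).
                return Code.OK
            return Code.ALREADY_EXISTS
        self.in_flight.add(volume_id)
        try:
            # A real plugin would set up the device at global_target_path here.
            self.published[key] = volume_capability
        finally:
            self.in_flight.discard(volume_id)
        return Code.OK
```

Calling `node_publish_device` twice with the same arguments returns `Code.OK` both times, while changing only the capability on the second call returns `Code.ALREADY_EXISTS`.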
11081113
#### `NodeUnpublishDevice`
11091114

11101115
A Node Plugin MUST implement this RPC call if it has `PUBLISH_UNPUBLISH_DEVICE` node capability.
11111116
This RPC is a reverse operation of `NodePublishDevice`.
11121117
This RPC MUST undo the work by the corresponding `NodePublishDevice`.
11131118
This RPC SHALL be called by the CO once for each `global_target_path` that was successfully set up via `NodePublishDevice`.
1114-
If the corresponding Controller Plugin has `PUBLISH_UNPUBLISH_VOLUME` controller capability, the CO SHOULD issue all `NodeUnpublishDevice` (as specified above) before calling `ControllerUnpublishVolume` for the given node and the given volume.
1119+
If the corresponding Controller Plugin has `PUBLISH_UNPUBLISH_VOLUME` controller capability and the Node Plugin has `PUBLISH_UNPUBLISH_DEVICE` capability, the CO MUST guarantee that this RPC is called and returns success before calling `ControllerUnpublishVolume` for the given node and the given volume.
1120+
The CO MUST guarantee that this RPC is called after all `NodeUnpublishVolume` calls have been made and have returned success for the given volume on the given node.
1121+
11151122
The Plugin SHALL assume that this RPC will be executed on the node where the volume is being used.
11161123

11171124
This RPC is typically called by the CO when the workload using the volume is being moved to a different node, or when all workloads using the volume on a node have finished.
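The teardown ordering described above mirrors the publish path in reverse. The sketch below is illustrative only: `unpublish_call_order` and its parameters are hypothetical, and `target_paths` is assumed to hold the per-workload paths that were previously passed to `NodePublishVolume`.

```python
from typing import Iterable, List


def unpublish_call_order(
    target_paths: Iterable[str],
    controller_publish: bool,
    node_publish_device: bool,
) -> List[str]:
    """Return the RPCs a CO would issue, in order, to tear down one volume on one node.

    Hypothetical helper: the flags stand for the PUBLISH_UNPUBLISH_VOLUME
    controller capability and the PUBLISH_UNPUBLISH_DEVICE node capability.
    """
    # Every NodeUnpublishVolume must be called and succeed first.
    calls = [f"NodeUnpublishVolume({path})" for path in sorted(target_paths)]
    if node_publish_device:
        # Only after all NodeUnpublishVolume calls have returned success.
        calls.append("NodeUnpublishDevice")
    if controller_publish:
        # Only after NodeUnpublishDevice has returned success.
        calls.append("ControllerUnpublishVolume")
    return calls
```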
@@ -1133,17 +1140,20 @@ message NodeUnpublishDeviceRequest {
11331140
string global_target_path = 3;
11341141
}
11351142
1136-
message NodeUnpublishDeviceResponse {
1137-
message Result {}
1138-
1139-
// One of the following fields MUST be specified.
1140-
oneof reply {
1141-
Result result = 1;
1142-
Error error = 2;
1143-
}
1144-
}
1143+
message NodeUnpublishDeviceResponse {}
11451144
```
11461145

1146+
#### NodeUnpublishDevice Errors
1147+
1148+
If the plugin is unable to complete the NodeUnpublishDevice call successfully, it MUST return a non-ok gRPC code in the gRPC status.
1149+
If the conditions defined below are encountered, the plugin MUST return the specified gRPC error code.
1150+
The CO MUST implement the specified error recovery behavior when it encounters the gRPC error code.
1151+
1152+
| Condition | gRPC Code | Description | Recovery Behavior |
1153+
|-----------|-----------|-------------|-------------------|
1154+
| Volume does not exist | 5 NOT_FOUND | Indicates that a volume corresponding to the specified `volume_id` does not exist. | Caller MUST verify that the `volume_id` is correct and that the volume is accessible and has not been deleted before retrying with exponential back off. |
1155+
| Operation pending for volume | 10 ABORTED | Indicates that there is already an operation pending for the specified volume. In general the Cluster Orchestrator (CO) is responsible for ensuring that there is no more than one call "in-flight" per volume at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same volume. The Plugin SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified volume, and then retry with exponential back off. |
1156+
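The "retry with exponential back off" recovery behavior for `ABORTED` could look roughly like the following on the CO side. This is a sketch under stated assumptions: `call_with_backoff`, its parameters, and the idea of passing the RPC as a zero-argument callable that returns a numeric status code are all hypothetical. `NOT_FOUND` is deliberately returned to the caller without retrying, since the table requires the caller to verify the `volume_id` first.

```python
import time

# Standard gRPC status code values.
OK = 0
NOT_FOUND = 5
ABORTED = 10


def call_with_backoff(rpc, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `rpc()` until it stops returning ABORTED or attempts run out.

    `rpc` is a hypothetical zero-argument callable returning a gRPC status
    code as an int. The delay doubles after every ABORTED response.
    """
    delay = base_delay
    code = ABORTED
    for attempt in range(1, max_attempts + 1):
        code = rpc()
        if code != ABORTED:
            # Success, or an error that needs different handling (e.g. NOT_FOUND).
            return code
        if attempt < max_attempts:
            sleep(delay)
            delay *= 2
    return code
```

Injecting `sleep` keeps the helper easy to exercise without real waiting; a production CO would likely also add jitter and an upper bound on the delay.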
11471157
#### `NodePublishVolume`
11481158

11491159
This RPC is called by the CO when a workload that wants to use the specified volume is placed (scheduled) on a node.
@@ -1184,6 +1194,8 @@ message NodePublishVolumeRequest {
11841194
// The path to which the device was mounted by `NodePublishDevice`.
11851195
// It MUST be an absolute path in the root filesystem of the process
11861196
// serving this request.
1197+
// It MUST be set if the Node Plugin implements the
1198+
// `PUBLISH_UNPUBLISH_DEVICE` node capability.
11871199
// This is an OPTIONAL field.
11881200
string global_target_path = 4;
11891201
@@ -1264,6 +1276,8 @@ message NodeUnpublishVolumeRequest {
12641276
// The path to which the device was mounted by `NodePublishDevice`.
12651277
// It MUST be an absolute path in the root filesystem of the process
12661278
// serving this request.
1279+
// It MUST be set if the Node Plugin implements the
1280+
// `PUBLISH_UNPUBLISH_DEVICE` node capability.
12671281
// This is an OPTIONAL field.
12681282
string global_target_path = 3;
12691283
@@ -1400,111 +1414,6 @@ message NodeServiceCapability {
14001414
##### NodeGetCapabilities Errors
14011415

14021416
If the plugin is unable to complete the NodeGetCapabilities call successfully, it MUST return a non-ok gRPC code in the gRPC status.
1403-
string error_description = 2;
1404-
}
1405-
1406-
// `NodePublishDevice` specific error.
1407-
message NodePublishDeviceError {
1408-
enum NodePublishDeviceErrorCode {
1409-
// Default value for backwards compatibility. SHOULD NOT be
1410-
// returned by Plugins. However, if a Plugin returns a
1411-
// `NodePublishDeviceErrorCode` code that an older CSI
1412-
// client is not aware of, the client will see this code (the
1413-
// default fallback).
1414-
//
1415-
// Recovery behavior: Caller SHOULD consider updating CSI client
1416-
// to match Plugin CSI version.
1417-
UNKNOWN = 0;
1418-
1419-
// Indicates that there is a already an operation pending for the
1420-
// specified volume. In general the Cluster Orchestrator (CO) is
1421-
// responsible for ensuring that there is no more than one call
1422-
// “in-flight” per volume at a given time. However, in some
1423-
// circumstances, the CO MAY lose state (for example when the CO
1424-
// crashes and restarts), and MAY issue multiple calls
1425-
// simultaneously for the same volume. The Plugin, SHOULD handle
1426-
// this as gracefully as possible, and MAY return this error code
1427-
// to reject secondary calls.
1428-
//
1429-
// Recovery behavior: Caller SHOULD ensure that there are no other
1430-
// calls pending for the specified volume, and then retry with
1431-
// exponential back off.
1432-
OPERATION_PENDING_FOR_VOLUME = 1;
1433-
1434-
// Indicates that a volume corresponding to the specified
1435-
// volume ID does not exist.
1436-
//
1437-
// Recovery behavior: Caller SHOULD verify that the volume ID
1438-
// is correct and that the volume is accessible and has not been
1439-
// deleted before retrying with exponential back off.
1440-
VOLUME_DOES_NOT_EXIST = 2;
1441-
1442-
UNSUPPORTED_MOUNT_FLAGS = 3;
1443-
UNSUPPORTED_VOLUME_TYPE = 4;
1444-
UNSUPPORTED_FS_TYPE = 5;
1445-
MOUNT_ERROR = 6;
1446-
1447-
// Indicates that the specified volume ID is not allowed or
1448-
// understood by the Plugin. More human-readable information MAY
1449-
// be provided in the `error_description` field.
1450-
//
1451-
// Recovery behavior: Caller MUST fix the volume ID before
1452-
// retrying.
1453-
INVALID_VOLUME_ID = 7;
1454-
}
1455-
1456-
NodePublishDeviceErrorCode error_code = 1;
1457-
string error_description = 2;
1458-
}
1459-
1460-
// `NodeUnpublishDevice` specific error.
1461-
message NodeUnpublishDeviceError {
1462-
enum NodeUnpublishDeviceErrorCode {
1463-
// Default value for backwards compatibility. SHOULD NOT be
1464-
// returned by Plugins. However, if a Plugin returns a
1465-
// `NodeUnpublishDeviceErrorCode` code that an older CSI
1466-
// client is not aware of, the client will see this code (the
1467-
// default fallback).
1468-
//
1469-
// Recovery behavior: Caller SHOULD consider updating CSI client
1470-
// to match Plugin CSI version.
1471-
UNKNOWN = 0;
1472-
1473-
// Indicates that there is a already an operation pending for the
1474-
// specified volume. In general the Cluster Orchestrator (CO) is
1475-
// responsible for ensuring that there is no more than one call
1476-
// “in-flight” per volume at a given time. However, in some
1477-
// circumstances, the CO MAY lose state (for example when the CO
1478-
// crashes and restarts), and MAY issue multiple calls
1479-
// simultaneously for the same volume. The Plugin, SHOULD handle
1480-
// this as gracefully as possible, and MAY return this error code
1481-
// to reject secondary calls.
1482-
//
1483-
// Recovery behavior: Caller SHOULD ensure that there are no other
1484-
// calls pending for the specified volume, and then retry with
1485-
// exponential back off.
1486-
OPERATION_PENDING_FOR_VOLUME = 1;
1487-
1488-
// Indicates that a volume corresponding to the specified
1489-
// volume ID does not exist.
1490-
//
1491-
// Recovery behavior: Caller SHOULD verify that the volume ID
1492-
// is correct and that the volume is accessible and has not been
1493-
// deleted before retrying with exponential back off.
1494-
VOLUME_DOES_NOT_EXIST = 2;
1495-
1496-
UNMOUNT_ERROR = 3;
1497-
1498-
// Indicates that the specified volume ID is not allowed or
1499-
// understood by the Plugin. More human-readable information MAY
1500-
// be provided in the `error_description` field.
1501-
//
1502-
// Recovery behavior: Caller MUST fix the volume ID before
1503-
// retrying.
1504-
INVALID_VOLUME_ID = 4;
1505-
}
1506-
1507-
NodeUnpublishDeviceErrorCode error_code = 1;
15081417

15091418
## Protocol
15101419
