Skip to content

GetNodeId is burdensome #141

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
saad-ali opened this issue Nov 8, 2017 · 18 comments
Closed

GetNodeId is burdensome #141

saad-ali opened this issue Nov 8, 2017 · 18 comments

Comments

@saad-ali
Copy link
Member

saad-ali commented Nov 8, 2017

Arguably for many SPs the NodeID will be the same as the CO's nodeID, and forcing them to figure out their own nodeID via the GetNodeID call is burdensome. Can we make life easier for such plugins?

We should consider passing the CO node ID in the GetNodeID request and if the SP doesn't have a custom node ID it can just return that instead of doing extra work to figure out what the CO thinks the node ID should be.

@akutz
Copy link
Contributor

akutz commented Nov 8, 2017

Except how does an SP know the provided node ID is the same as the one returned by GetNodeID? Or how did the CO get it in the first place since the node ID is specific to the SP and opaque to the CO.

In my experience implementing SPs, the node IDs for each are disparate and unique to the SP. I’m just not sure how the CO is getting the node IDs in the first place when that process is in part why SPs are necessary.

@saad-ali
Copy link
Member Author

saad-ali commented Nov 8, 2017

Except how does an SP know the provided node ID is the same as the one returned by GetNodeID?

It would be up to the SP to determine if it wants a custom node ID or not. If it wants a custom node ID, it would completely disregard the node ID passed in by the CO and return the node ID it wants the CO to use. If it can use the same node ID as the CO, instead of figuring out what that ID is, it would simply return the same node ID passed in by the CO.

Or how did the CO get it in the first place since the node ID is specific to the SP and opaque to the CO.

The ID passed in would be the CO node ID NOT specific to any SP.

In my experience implementing SPs, the node IDs for each are disparate and unique to the SP. I’m just not sure how the CO is getting the node IDs in the first place when that process is in part why SPs are necessary.

The CO has it's own way it identifies nodes. This is simply saying that that should be passed as input for the SP to optionally use.

@edisonxiang
Copy link
Contributor

Hi saad, this is a very good question. I took a look at the old design about GetNodeID.
I think it's better to call it 'GetNodeInfo' not 'GetNodeID'.
When I implement 'ControllerPublishVolume' interface for CSI-ISCSI Plugin, I expect to get the NodeInfo from CO like below:

        "node_info" : {
                "id" :   "123456"
                "ip" : "192.168.1.100",
                "hostname" : "testhost",
                "iqn" : "iqn.1994-05.com.redhat:e07473c46c5e",
            }
        }

The NodeInfo like 'iqn' will be used to init ISCSI Target on the Storage Server. The NodeInfo indicates which node could connect target on the Storage Server, and then 'NodePublishVolume' could do the work to connect the Storage Servers. So I proposal to change 'GetNodeID' into 'GetNodeInfo' like below:

message GetNodeInfoRequest {
  Version version = 1;
}
message GetNodeInfoResponse {
  message Result {
    map<string, string> node_info = 1;
  }
  // One of the following fields MUST be specified.
  oneof reply {
    Result result = 1;
    Error error = 2;
  }
}

If 'GetNodeID' only returns the NodeID, it is ok for the OnCloud scenario like GCE or OpenStack and so on. In the OnCloud scenario, it just needs to launch attach operation only by using 'NodeName' and 'DiskName'. GCE Example. But in the OffCloud scenario, 'ControllerPublishVolume' need more information about the Node like ISCSI example. So what do you think about that? Thank you.

@akutz
Copy link
Contributor

akutz commented Nov 8, 2017

I just don’t see the utility of this change. It assumes the CO has knowledge that simply cannot be assumed and breaks the relationship between the CO and SP. I wholeheartedly disagree with this proposal.

@cpuguy83
Copy link
Contributor

cpuguy83 commented Nov 8, 2017

@akutz

It assumes the CO has knowledge that simply cannot be assumed

Can you explain what you mean by this?

I do think it's a little strange to pass in a node ID into GetNodeID. The node plugin could return an empty ID if it doesn't implement a custom node ID... or we could make the call optional.

@akutz
Copy link
Contributor

akutz commented Nov 8, 2017

It assumes the CO has knowledge that simply cannot be assumed

Can you explain what you mean by this?

Hi @cpuguy83,

Sure. The node ID is a piece of information that is specific to the SP. For example, on any host one could argue that the node ID could be the host name. Maybe that is what the CO elects to use as the CO's node ID. However, this is not the node ID for an EBS host, a ScaleIO host, or many other hosts connected to storage platforms. And how does a CO know what the node ID is for one of those hosts as it relates to the storage platform? Simply put, it doesn't.

This is where @saad-ali's proposed workflow does not make sense to me. He says an SP can check to see if the provided node ID is valid, but how does it do that? With a regex? That's not a reliable or robust solution IMO. What likely has to happen is the SP must get the node ID anyway and compare it to the one provided by the CO. Which means the SP still got the node ID.

a custom node ID

Most of the storage platforms with which I've worked have custom node IDs:

Platform Node ID Type
Microsoft Azure Unmanaged Disk Host Name
Cinder OpenStack Remote Metadata
Digital Ocean Block Storage Remote Metadata
EBS Remote Metadata
EFS Remote Metadata
FittedCloud Remote Metadata
Google Cloud Persistent Disk Host Name + Project Information
Isilon List of Local IP Addresses
S3FS Host Name
ScaleIO Kernel Metadata (GUID)
VirtualBox List of Local MAC Addresses

The above links demonstrate working examples from real storage platforms that show the disparate nature of what amounts to the CSI node ID.

A node ID is specific to a storage platform, and a CO is simply unlikely to have knowledge of the node ID. That's the entire point of the SP -- to provide an opaque means of interacting with several, disparate storage platforms. This proposal is in direct contradiction to that.

@akutz
Copy link
Contributor

akutz commented Nov 8, 2017

Also, this does not make sense to me:

GetNodeID is burdensome

How so? It is called once on a node host by the CO. The Node ID is supposed to be static and cacheable by the CO, so I do not see how getting the node ID once per SP is burdensome.

@cpuguy83
Copy link
Contributor

cpuguy83 commented Nov 8, 2017

I do not think it's about checking if a node ID is valid but rather giving the node a default node ID for which it can respond with if the plugin doesn't care about node ID.

@akutz
Copy link
Contributor

akutz commented Nov 8, 2017

Hi @cpuguy83,

My points are:

  1. Based on my experience most SPs will care about the node ID.
  2. An SP will have to get its own node ID to determine if the one provided is valid.
  3. This is not an expensive or burdensome piece of the workflow as it should only be called once per SP per node.

@codenrhoden
Copy link

I would take a different tack on this... The spec calls out the following for the GetNodeID response:

message GetNodeIDResponse {
  message Result {
    // The ID of the node which SHALL be used by CO in
    // `ControllerPublishVolume`. This is an OPTIONAL field. If unset,
    // the CO SHALL leave the `node_id` field unset in
    // `ControllerPublishVolume`.
    string node_id = 1;
  }

The node node_id is optional. If the SP doesn't need a special/custom ID, it doesn't return one. So in my opinion, this case is already covered. The SPs are never "forced" to figure out their node ID. They either need one to perform their duties, or they don't. If they do need one, they should provide it. I wouldn't call it burdensome.

@akutz
Copy link
Contributor

akutz commented Nov 8, 2017

Hi @codenrhoden,

I was literally about to suggest the same thing. Why pass in a default value at all? If GetNodeID is not implemented then simply don't call it, or if you call it and an empty response is provided, ignore it.

Totally agreed with you Travis.

@akutz
Copy link
Contributor

akutz commented Nov 8, 2017

Hi @saad-ali,

It's odd that the result of GetNodeId is optional instead of the RPC, since any other RPC in the spec is the optional component, not its result.

This feels to me like the point of the Capabilities. I'd introduce a new Node Capability NODEID. If an SP advertises the Node Capability NODEID then a CO should invoke GetNodeId and use the result for the controller's PUBLISH_UNPUBLISH_VOLUME RPCs.

@jieyu
Copy link
Member

jieyu commented Nov 9, 2017

@akutz based on the discussion in #121, GetNodeId RPC is tied to Controller's PUBLISH_UNPUBLISH_VOLUME capability. I am in favor of doing that, instead of introducing another Node Capability.

@saad-ali
Copy link
Member Author

The node node_id is optional. If the SP doesn't need a special/custom ID, it doesn't return one

Problem with that is in ControllerUnpublishVolumeRequest a lack of node_id is used to indicate that the volume should be detached from any and all nodes it is attached to (i.e. a way for the caller to be able to say, I don't care where it is attached, just detach it).

GetNodeId RPC is tied to Controller's PUBLISH_UNPUBLISH_VOLUME capability

This is not sufficient to resolve the original issue. If I am writing an SP that can understand and use the CO's node_id, I shouldn't have to do a ton of work to discover what that node_id is.

@akutz
Copy link
Contributor

akutz commented Nov 17, 2017

If I am writing an SP that can understand and use the CO's node_id

This will literally never happen unless the SP simply doesn't have a node ID and elects to use the CO's provided string. It is not ever the business of the CO to presume to know what the SP uses as the node ID.

Because the only valid case for this proposal is when the SP doesn't provide a node ID, I don't see why the proposal isn't the following:

If the SP returns an empty Node ID then the CO is free to provide whatever value it wishes to ControllerPublishVolume.

@saad-ali
Copy link
Member Author

This will literally never happen unless the SP simply doesn't have a node ID and elects to use the CO's provided string

The cases that I'm thinking about are cloud provider SP's like GCE PD or AWS EBS where the names that the CO uses to identify the node are identical to the way the SP identifies nodes.

If the SP returns an empty Node ID then the CO is free to provide whatever value it wishes to ControllerPublishVolume.

I would prefer letting the SP explicitly make the call, rather than putting something in the spec that leaves an input completely undefined and unvalidated. If the CO's node_id is passed to the SP in GetNodeId, the SP can validate that ID, if the id doesn't match the expected format then it can choose to do a heavyweight operation to discover the id, otherwise it can just return that id.

@akutz
Copy link
Contributor

akutz commented Nov 20, 2017

Hi Saad,

But you are letting the SP make that decision. It decides to not return a Node ID with the understanding that the CO will use one of its own.

@saad-ali
Copy link
Member Author

saad-ali commented Nov 2, 2018

I'm ok closing this. We worked around it on k8s side.

@saad-ali saad-ali closed this as completed Nov 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants