PI object created from native handles take ownership #1516
I think it is important that the PI objects created with the |
My primary concern is that this would make a difference in how native objects are handled between the OpenCL BE and the CUDA BE. For the sake of argument, assume we do as you suggest and have +1 reference count on the created object. Say now that we have the Now we imagine a very simple use case where a user creates a SYCL object from a native handle, uses the SYCL object, and wants to clean up after it. Using the CUDA backend this would be done as:
However, for the OpenCL BE it would require something along the lines of:
This is the primary reason for the two options presented originally, specifically;
Since we've elected to implement the latter we need to make sure that step 4. of the OpenCL example above need not be done, as it would have taken ownership. Of course, if the user had done their own retains they should also do the same amount of releases, but they shouldn't have to do a release of the object from the creation of it. What you suggest is probably more in line with option 1. however, in which case we would have to prevent the CUDA backend from destroying CUDA objects used in a |
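The difference between the two behaviours discussed above can be illustrated with a small model. This is only a sketch: `MockNative`, `mockRetain`, `mockRelease`, `Wrapper`, and `destroyedWithoutUserRelease` are hypothetical names invented for illustration and are not part of PI, OpenCL, or CUDA. The mock stands in for a reference-counted native object, and the wrapper for a PI/SYCL object created from it.

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted native object.
struct MockNative {
  int refs = 1;           // creating the native object yields one reference
  bool destroyed = false;
};

void mockRetain(MockNative &n) { ++n.refs; }

void mockRelease(MockNative &n) {
  assert(!n.destroyed && n.refs > 0);
  if (--n.refs == 0)
    n.destroyed = true;   // last reference gone: the object is destroyed
}

// Hypothetical wrapper modelling an object created from a native handle.
class Wrapper {
  MockNative *native;

public:
  // retainOnCreate == true models "+1 reference on creation";
  // retainOnCreate == false models "take over the creation reference".
  Wrapper(MockNative &n, bool retainOnCreate) : native(&n) {
    if (retainOnCreate)
      mockRetain(n);
  }
  Wrapper(const Wrapper &) = delete;
  Wrapper &operator=(const Wrapper &) = delete;
  ~Wrapper() { mockRelease(*native); }  // always drops exactly one reference
};

// Reports whether the native object is gone once the wrapper is destroyed,
// without the user performing any release of their own.
bool destroyedWithoutUserRelease(bool retainOnCreate) {
  MockNative n;
  { Wrapper w(n, retainOnCreate); }
  return n.destroyed;
}
```

In this model, taking over the creation reference means the native object is destroyed together with the wrapper, whereas retaining on creation leaves one reference that the user still has to release themselves.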
Ping @smaslov-intel. @romanovvlad may also have some input regarding this? |
The retain is called for OpenCL BE and so there needs to be step 4 for the release, right?
Can we release the OpenCL native handle as many times as needed for its reference counter to go to 0 at the destruction of the PI object owning that native handle? |
For the OpenCL interoperability constructors that take OpenCL objects that is the case, but those constructors cannot be used with the CUDA backend. That is why the
In principle we could probably do this as the reference count is exposed through the info queries, though the specification, in this case taken from
Additionally, which is likely part of why the previous quote is in the specification, we could have a case where the user has passed the OpenCL object to an OpenCL operation that retained the object, and while the operation is running the user creates a SYCL object from the object, using the future However, my reasoning behind it being enough to just not retain the OpenCL object when creating a SYCL object is that the SYCL object gets the responsibility of the single reference that was made upon creation of the OpenCL object. Effectively this means that a user that has called |
Trying to understand the topic...
Probably it would be better to define a common behavior that can be supported by all backends: [UPD] release function is still needed for CUDA BE in current implementation. |
The only thing I see in the proposal is regarding the
This is what we currently have, that is the SYCL object takes some ownership of the native object. This effectively means that the users will only have to release a native object if they have retained it explicitly. For OpenCL this means that the SYCL object simply doesn't retain the native object upon creation, so it takes responsibility of releasing the one reference from creation, but any additional retains (either by the user or from e.g. library calls) are not the responsibility of the SYCL object. Since CUDA doesn't have reference counting on native objects the user cannot retain, so if a SYCL object is created from a native object it will also destroy it. I believe this makes the ownership issue somewhat more consistent between the backends.
In the CUDA backend the retain and release is done on the PI object. Unlike with the OpenCL backend, where the PI objects can share the reference counts with the native objects, CUDA objects are not reference counted. Either way, the user should not have access to any sort of retain or release in the CUDA BE. |
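A rough model of the arrangement described here, where the reference count lives on the PI object because the native object has none, might look like the following. All names (`MockCudaNative`, `MockPiObject`, `mockCreateWithNativeHandle`, `mockPiRetain`, `mockPiRelease`) are hypothetical and do not correspond to actual plugin code; this is a sketch of the idea rather than the implementation.

```cpp
#include <cassert>

// Hypothetical native object without any reference counting.
struct MockCudaNative {
  bool destroyed = false;
};

void mockDestroy(MockCudaNative &n) { n.destroyed = true; }

// Hypothetical PI-level object: the reference count is kept here,
// not on the native object it wraps.
struct MockPiObject {
  MockCudaNative *native;
  int refs;
};

// Wraps the native object and takes ownership of it.
MockPiObject *mockCreateWithNativeHandle(MockCudaNative &n) {
  return new MockPiObject{&n, 1};
}

void mockPiRetain(MockPiObject *o) { ++o->refs; }

// Drops one PI reference. When the last one goes away, the native object
// is destroyed along with the PI object. Returns true in that case.
bool mockPiRelease(MockPiObject *o) {
  if (--o->refs > 0)
    return false;
  mockDestroy(*o->native);
  delete o;
  return true;
}
```

The user never sees a retain or release on the native object in this model; only the runtime manipulates the PI-level count.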
@smaslov-intel recently pointed me to this issue and asked me to comment. I'm a member of the SYCL language committee, and I think I have some understanding of the backend generalization proposal as it is specified in SYCL 2020 provisional. The SYCL spec defines a set of common APIs for interoperating between SYCL objects and native backend objects. However, the exact semantics of these APIs can be different for each backend. In particular, the lifetime guarantees about the native objects encapsulated by a SYCL object are not specified in SYCL 2020. Instead, each backend should have its own "backend specification document" which documents these details. For example, a SYCL application can create a SYCL object from a native object via one of the Likewise, a SYCL application can get an underlying native object from a SYCL object by calling one of the All mention of reference counting of native objects has been removed from the SYCL 2020 spec. Therefore, it is not necessary for the underlying native objects to even support the concept of reference counting. The OpenCL "backend specification" (which is an appendix in SYCL 2020) does rely on reference counting of native objects, but this is specific to the OpenCL backend. Other backends need not do this. For OpenCL, Other backends need to decide their own lifetime rules. I'm not that familiar with CUDA, but I know that Level 0 does not support reference counted objects. (From the conversation above, it sounds like CUDA also does not support reference counted objects.) Therefore, it is not possible for the Level 0 version of
Personally, I think the rules for (2) might get confusing, especially when one SYCL object contains another SYCL object. Therefore, (1) might be less confusing for customers. Another possibility would be to provide a parameter that lets the user choose between these two behaviors. For example, the Level 0 version of |
Thank you for your insight, @gmlueck ! My biggest concern here is how transparent the necessary lifetime of the native handle is, because if the user keeps sole ownership of a native handle they must make sure that it's not used by any SYCL object still alive when they destroy the resource. Expressing this to the user may become problematic because they only know of the reference that they're given to the SYCL object they created with the native handle, but even after that is gone there may be other SYCL objects keeping that exact SYCL object alive. Tracking that lifetime without having a view of the underlying runtime is not trivial and can easily lead to very confusing errors that will be extremely difficult to debug. On the other hand, relinquishing all user ownership of the native object when creating a SYCL object means that the user will not have to worry about the native handle anymore and all they have to do is make sure that the created SYCL object stays alive for the time that they need the native handle. Granted there are problems with both and I can imagine there are use-cases speaking for/against one or the other. However, I believe you are right in that SYCL 2020 makes this backend-specific, so maybe it's time to let OpenCL be OpenCL and let the internal reference count do its job.
That may not be a bad solution to this. Maybe have something similar to |
I agree with you here, which is why I suggest option (1) in my list above.
Adding a property list is not a bad idea. Another option is to just remove the interop API prototypes from the SYCL 2020 spec and require each backend to define these. I'm not sure how useful it is to require each backend to have the same API if each backend will have a different semantic. Regarding the L0 backend specifically, do others think it is useful to provide a parameter of some sort to choose between my options (1) and (2)? If there are no use cases for option (2), there's no sense in defining a parameter. |
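To make the idea of a user-selectable behaviour concrete, here is one way such a parameter could look. This is purely illustrative: the `Ownership` enum, `MockHandle`, `InteropObject`, and `handleDestroyedWithObject` are hypothetical names, and nothing here reflects an actual SYCL, PI, or Level 0 interface.

```cpp
#include <cassert>

// Hypothetical parameter selecting who is responsible for the native handle.
enum class Ownership {
  Transfer,  // the wrapping object destroys the handle when it dies
  Keep       // the user remains responsible for destroying the handle
};

// Hypothetical native handle without reference counting.
struct MockHandle {
  bool destroyed = false;
};

// Hypothetical interop object created from a native handle.
class InteropObject {
  MockHandle *handle;
  Ownership mode;

public:
  InteropObject(MockHandle &h, Ownership m) : handle(&h), mode(m) {}
  InteropObject(const InteropObject &) = delete;
  InteropObject &operator=(const InteropObject &) = delete;
  ~InteropObject() {
    if (mode == Ownership::Transfer)
      handle->destroyed = true;  // models destroying the native object
  }
};

// Reports whether the handle was destroyed together with the interop object.
bool handleDestroyedWithObject(Ownership m) {
  MockHandle h;
  { InteropObject obj(h, m); }
  return h.destroyed;
}
```

With `Ownership::Keep` the user must make sure the handle outlives every object that still refers to it, which is the lifetime-tracking burden raised earlier in the thread.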
Ah, apologies. I somehow convinced myself it was referring to (1) in the issue. Then I think we're on the same page. |
Since this issue was created I already added some of I believe what's implemented matches the agreement between @steffenlarsen and @gmlueck that the SYCL object takes ownership (OpenCL plugin does not retain native objects) and that @smaslov-intel wanted that no explicit call to |
The discussion about calling I skimmed through the L0 code, and I didn't see any problem except that L0 never calls zeModuleDestroy() for any SYCL I don't think the OpenCL code is correct. According to the SYCL 2020 OpenCL backend specification, creating a SYCL object from a native object should increment the reference count on the underlying OpenCL object (i.e. call
However, I don't see where that is happening. For example, this is from "sycl/plugins/opencl/pi_opencl.cpp":
Is |
Right, this is a known issue, and there is a TODO asking to call the
This is talking about constructors not the |
I agree that the wording should be clearer, but I believe this is talking about the
The L0 and the OpenCL backends have different semantics for |
Why can't we go with "takes ownership" universally, i.e. remove the requirement to retain native OpenCL object from SYCL 2020 ? |
I have two answers:
|
At the time of implementing the interoperability operations, focus was on trying to make the PI interoperability functions as consistent between backends as possible. For That said, I agree with @gmlueck that given the freedom offered by it being backend-specific behaviour I think the PI API should allow the same freedom. After all, it is interop so users are likely only using one of the APIs anyway, and if not they should understand the differences between them. |
This can be closed as we had settled with a solution, which is different for different backends.
I am not aware of any CUDA backend specification, and one should probably be created. |
#1332 introduces a number of PI API functions for creating various PI objects from native objects (`piext*CreateWithNativeHandle`). However, given that the CUDA backend is unable to share a reference count between the PI object and the native object, like the OpenCL backend does, two solutions were proposed:

1. The user keeps ownership of the native object, which would require the CUDA backend to have a special case for PI objects that would not destroy the native object upon their own destruction. This would require the user to keep the native object alive while using the corresponding PI objects.
2. The created PI object takes ownership of the native object, which doesn't require changes to either backend, but can be implemented by not having these functions retain the native objects in the OpenCL BE. This requires that the user keeps the PI object alive for as long as the native object is needed.

Of these options the second was chosen. With this implementation, in order to adhere to the requirement from the SYCL 1.2.1 specification that interoperability constructors must retain the OpenCL object, `cl*Retain` is called in those constructors.

Currently these constructors are the only users of `piext*CreateWithNativeHandle`, though these functions are expected to be called fairly directly by a future `make` function similar to the new `get_native` functions. At that point, the lifetime choices made for native objects with `piext*CreateWithNativeHandle` become exposed to the user.

The point of this issue is to review these choices and make sure we reach a solution that we are happy with before this is directly exposed to the SYCL users. See discussion on the subject here: #1332 (comment).