
[CUDA] P2P buffer/image memory copy #4401

Closed
Changes from 8 commits
Commits
26 commits
4a71379
Implemented P2P copies for the cuda backend using buffers.
Aug 12, 2021
c771093
Switched to using vendor name in P2P info query.
Aug 13, 2021
b4a99ff
Merge branch 'intel:sycl' into P2P
JackAKirk Aug 13, 2021
b38d5e2
Merge branch 'intel:sycl' into P2P
JackAKirk Aug 23, 2021
6abe9fb
Corrected the scoped context in guessLocalWorkSize to prevent stale c…
Aug 23, 2021
a3c251e
Added binary device query for P2P memcpy instead of platform query.
JackAKirk Aug 25, 2021
3cd6911
Corrected formatting.
Aug 25, 2021
c384fbe
Corrected Formatting.
Aug 25, 2021
27c073d
Placed new PI API's after piTearDown.
Aug 25, 2021
60f276d
Renamed piextP2P as piextDevicesSupportP2P.
Aug 25, 2021
835e5c4
Made check that devices backends match before P2P query.
Aug 25, 2021
9e3ef85
Corrected formating in graph_builder.cpp.
Aug 25, 2021
0f36b94
Merge branch 'intel:sycl' into P2P
JackAKirk Aug 26, 2021
0d819c0
Replaced binary device query with device_info call returning a vector…
Aug 26, 2021
43f970c
Removed piext Peer functions, replaced them with existing PI copy calls.
Aug 31, 2021
9b9a21f
Fixed formatting issues following previous commit.
Aug 31, 2021
52cae9f
P2P copies made for 1D image arrays again.
Sep 1, 2021
24b240a
Return retError from call to commonEnqueueMemImageNDCopyPeer.
Sep 1, 2021
99e58f1
Removed all changes to memory_manager.cpp:
Sep 2, 2021
fd15910
Superficial change to make the memory_manager.cpp diff empty.
Sep 2, 2021
ccba446
Applied stylistic/general improvements.
Sep 21, 2021
0f95aa7
Merge branch 'sycl' into P2P
JackAKirk Sep 23, 2021
7057849
Reverted change to guessLocalWorkSize: unnecessary since #4606.
Sep 23, 2021
9d5a84a
Merge branch 'intel:sycl' into P2P
JackAKirk Sep 30, 2021
b64484d
Implemented PI_DEVICE_INFO_P2P_READ_DEVICES piDeviceGetInfo case in o…
JackAKirk Oct 1, 2021
ebfdb2a
Merge branch 'intel:sycl' into P2P
JackAKirk Oct 1, 2021
4 changes: 4 additions & 0 deletions sycl/include/CL/sycl/detail/pi.def
@@ -33,6 +33,7 @@ _PI_API(piextDeviceSelectBinary)
_PI_API(piextGetDeviceFunctionPointer)
_PI_API(piextDeviceGetNativeHandle)
_PI_API(piextDeviceCreateWithNativeHandle)
_PI_API(piextP2P)
Contributor

@smaslov-intel I think the note on line 17 is not enough. We probably want a test that would fail hard.

Contributor Author

Yes, I missed the note. I can apply its guidance in a new commit.

Contributor Author

The note is also a little ambiguous to me: should I place new items at the end of the list, or at the end of the category they fit into? I.e., is piextP2P in the right place currently, or should it be placed after _PI_API(piTearDown)? Maybe I only have to move the Peer memcpy APIs?

Contributor

@JackAKirk All new values should go after piTearDown. The idea is that those macros are expanded into a large enum, which is used in multiple places, including instrumentation APIs that use the enum values as keys identifying which PI call is being executed.

Contributor Author

OK, makes sense, thanks.

Contributor

piextP2P is not a very informative name. Maybe something like piextDevicesSupportP2P would be more telling?

Contributor Author

Sounds good. I made the suggested change.

// Context
_PI_API(piContextCreate)
_PI_API(piContextGetInfo)
@@ -111,11 +112,14 @@ _PI_API(piEnqueueMemBufferReadRect)
_PI_API(piEnqueueMemBufferWrite)
_PI_API(piEnqueueMemBufferWriteRect)
_PI_API(piEnqueueMemBufferCopy)
_PI_API(piextEnqueueMemBufferCopyPeer)
_PI_API(piEnqueueMemBufferCopyRect)
_PI_API(piextEnqueueMemBufferCopyRectPeer)
_PI_API(piEnqueueMemBufferFill)
_PI_API(piEnqueueMemImageRead)
_PI_API(piEnqueueMemImageWrite)
_PI_API(piEnqueueMemImageCopy)
_PI_API(piextEnqueueMemImageCopyPeer)
_PI_API(piEnqueueMemImageFill)
_PI_API(piEnqueueMemBufferMap)
_PI_API(piEnqueueMemUnmap)
27 changes: 27 additions & 0 deletions sycl/include/CL/sycl/detail/pi.h
@@ -1048,6 +1048,19 @@ __SYCL_EXPORT pi_result piQueueGetInfo(pi_queue command_queue,
void *param_value,
size_t *param_value_size_ret);

__SYCL_EXPORT pi_result piextEnqueueMemBufferCopyPeer(
pi_queue src_queue, pi_mem src_buffer, pi_queue dst_queue,
pi_mem dst_buffer, size_t src_offset, size_t dst_offset, size_t size,
pi_uint32 num_events_in_wait_list, const pi_event *event_wait_list,
pi_event *event);

/// p2p is set true if PI API's,
/// piextEnqueueMemBufferCopyPeer/piextEnqueueMemBufferCopyRectPeer/piextEnqueueMemImageCopyPeer,
/// for peer to peer memory copy may be called.
///
__SYCL_EXPORT pi_result piextP2P(pi_device src_device, pi_device dst_device,
bool *p2p);

__SYCL_EXPORT pi_result piQueueRetain(pi_queue command_queue);

__SYCL_EXPORT pi_result piQueueRelease(pi_queue command_queue);
@@ -1452,6 +1465,14 @@ __SYCL_EXPORT pi_result piEnqueueMemBufferCopyRect(
pi_uint32 num_events_in_wait_list, const pi_event *event_wait_list,
pi_event *event);

__SYCL_EXPORT pi_result piextEnqueueMemBufferCopyRectPeer(
pi_queue src_queue, pi_mem src_buffer, pi_queue dst_queue,
pi_mem dst_buffer, pi_buff_rect_offset src_origin,
pi_buff_rect_offset dst_origin, pi_buff_rect_region region,
size_t src_row_pitch, size_t src_slice_pitch, size_t dst_row_pitch,
size_t dst_slice_pitch, pi_uint32 num_events_in_wait_list,
const pi_event *event_wait_list, pi_event *event);

__SYCL_EXPORT pi_result
piEnqueueMemBufferFill(pi_queue command_queue, pi_mem buffer,
const void *pattern, size_t pattern_size, size_t offset,
@@ -1477,6 +1498,12 @@ __SYCL_EXPORT pi_result piEnqueueMemImageCopy(
pi_image_region region, pi_uint32 num_events_in_wait_list,
const pi_event *event_wait_list, pi_event *event);

__SYCL_EXPORT pi_result piextEnqueueMemImageCopyPeer(
pi_queue command_queue, pi_mem src_image, pi_queue dst_queue,
pi_mem dst_image, pi_image_offset src_origin, pi_image_offset dst_origin,
pi_image_region region, pi_uint32 num_events_in_wait_list,
const pi_event *event_wait_list, pi_event *event);

__SYCL_EXPORT pi_result
piEnqueueMemImageFill(pi_queue command_queue, pi_mem image,
const void *fill_color, const size_t *origin,
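
The doc comment on `piextP2P` in the pi.h hunk says the query reports whether the peer copy entry points may be called for a pair of devices, and the commit list mentions checking that the device backends match before the query. A minimal sketch of how a caller might act on such a query follows. Everything here is a hypothetical stand-in: `MockDevice`, `mockDevicesSupportP2P`, `CopyPath`, and `chooseCopyPath` are not part of the PI headers, and the host-staged fallback is an assumption about what a caller would do when peer access is unavailable, not something this diff shows.

```cpp
#include <cassert>

// Stand-in for an opaque device handle (the real pi_device is defined by
// the PI headers and is not modelled here).
struct MockDevice {
  int backend;   // hypothetical backend tag, e.g. 0 = CUDA, 1 = other
  int peerGroup; // hypothetical: devices in one group can reach each other
};

// The two ways a cross-device copy could be carried out in this sketch.
enum class CopyPath { Peer, HostStaged };

// Mock of a "do these two devices support P2P?" query. It mirrors the shape
// of the query in the diff (two devices in, one boolean out) but its logic
// is invented for illustration.
inline bool mockDevicesSupportP2P(const MockDevice &src,
                                  const MockDevice &dst) {
  // Backends must match before peer access is even considered.
  if (src.backend != dst.backend)
    return false;
  return src.peerGroup == dst.peerGroup;
}

// Choose the direct peer copy only when the query allows it; otherwise fall
// back to copying through host memory (assumed fallback).
inline CopyPath chooseCopyPath(const MockDevice &src, const MockDevice &dst) {
  return mockDevicesSupportP2P(src, dst) ? CopyPath::Peer
                                         : CopyPath::HostStaged;
}
```

With these mocks, two devices sharing a backend and a peer group select `CopyPath::Peer`, while a backend mismatch or a different peer group selects `CopyPath::HostStaged`.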