
Commit 0abdfd6

[SYCL][CUDA] Don't infer memcpy src/dst pitches from copied region (intel#3563)
Previously, when passing 0 for row/slice pitches for source and/or destination to `commonEnqueueMemBufferCopyRect`, the function would infer the row/slice pitches from the copied region. This does not work when copying with an offset, as `cuMemcpy3DAsync` checks whether the pitches are at least as large as the offset + copied range, otherwise returning `CUDA_ERROR_INVALID_VALUE`.

For 1D / 2D copies, no / not all pitches are required, and can instead be specified as 0. The mechanism is thus removed and replaced with assertions to check that for 2D and 3D copies, non-zero pitches are provided.

In some situations, a zero row/slice pitch is passed for source and/or destination to `commonEnqueueMemBufferCopyRect`. In this case, the pitches are inferred from the copied region. However, if source and/or destination have offsets in the x or y dimension, those must be taken into account as well (CUDA checks this, and will otherwise return `CUDA_ERROR_INVALID_VALUE`).
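The constraint described in the message can be sketched as a small standalone check. This is an illustration only: `pitchCoversCopy` is a hypothetical helper, not a function in `pi_cuda.cpp` or the CUDA driver API, and it only models the "pitch is at least offset + copied range" rule stated above.

```cpp
#include <cstddef>

// Hypothetical helper (an assumption for illustration, not plugin code):
// returns true when a pitch is large enough to hold the offset plus the
// copied extent in that dimension, which is the condition the commit
// message says CUDA checks before accepting the copy.
inline bool pitchCoversCopy(std::size_t pitch, std::size_t offset,
                            std::size_t extent) {
  return pitch >= offset + extent;
}
```

For a 16-byte-wide region copied at an x offset of 8 bytes, a row pitch inferred from the region alone (16) fails this check, while a pitch of 24 (width plus offset) passes it.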
1 parent bf7eb4c commit 0abdfd6

1 file changed: +12 -6 lines changed

sycl/plugins/cuda/pi_cuda.cpp

Lines changed: 12 additions & 6 deletions
@@ -3654,12 +3654,18 @@ static pi_result commonEnqueueMemBufferCopyRect(
   assert(src_type == CU_MEMORYTYPE_DEVICE || src_type == CU_MEMORYTYPE_HOST);
   assert(dst_type == CU_MEMORYTYPE_DEVICE || dst_type == CU_MEMORYTYPE_HOST);
 
-  src_row_pitch = (!src_row_pitch) ? region->width_bytes : src_row_pitch;
-  src_slice_pitch = (!src_slice_pitch) ? (region->height_scalar * src_row_pitch)
-                                       : src_slice_pitch;
-  dst_row_pitch = (!dst_row_pitch) ? region->width_bytes : dst_row_pitch;
-  dst_slice_pitch = (!dst_slice_pitch) ? (region->height_scalar * dst_row_pitch)
-                                       : dst_slice_pitch;
+  src_row_pitch = (!src_row_pitch) ? region->width_bytes + src_offset->x_bytes
+                                   : src_row_pitch;
+  src_slice_pitch =
+      (!src_slice_pitch)
+          ? ((region->height_scalar + src_offset->y_scalar) * src_row_pitch)
+          : src_slice_pitch;
+  dst_row_pitch = (!dst_row_pitch) ? region->width_bytes + dst_offset->x_bytes
+                                   : dst_row_pitch;
+  dst_slice_pitch =
+      (!dst_slice_pitch)
+          ? ((region->height_scalar + dst_offset->y_scalar) * dst_row_pitch)
+          : dst_slice_pitch;
 
   CUDA_MEMCPY3D params = {};
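The added lines in the hunk apply the same inference to source and destination. A minimal standalone sketch of that logic follows, assuming simplified stand-in types: `CopyOffset`, `CopyRegion`, `Pitches`, and `inferPitches` are hypothetical names for illustration and are not the plugin's actual structs or functions.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the plugin's offset and region arguments.
struct CopyOffset {
  std::size_t x_bytes;
  std::size_t y_scalar;
};
struct CopyRegion {
  std::size_t width_bytes;
  std::size_t height_scalar;
};
struct Pitches {
  std::size_t row;
  std::size_t slice;
};

// A zero pitch means "infer it". The inferred row pitch covers the x offset
// plus the copied width; the inferred slice pitch covers the y offset plus
// the copied height, multiplied by the (possibly inferred) row pitch.
// Non-zero pitches are passed through unchanged.
inline Pitches inferPitches(std::size_t row_pitch, std::size_t slice_pitch,
                            const CopyRegion &region,
                            const CopyOffset &offset) {
  const std::size_t row =
      row_pitch ? row_pitch : region.width_bytes + offset.x_bytes;
  const std::size_t slice =
      slice_pitch ? slice_pitch
                  : (region.height_scalar + offset.y_scalar) * row;
  return Pitches{row, slice};
}
```

For a 16-byte by 4-row region at offset (8 bytes, 2 rows) with both pitches passed as 0, this yields a row pitch of 24 and a slice pitch of 144, whereas the removed lines would have produced 16 and 64.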
