Skip to content

fix: synchronize resubmission of the same command buffer #18566

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: sycl
Choose a base branch
from

Conversation

EuphoricThinking
Copy link
Contributor

added a test for in-order queues with command buffers

@@ -158,19 +158,14 @@ ur_result_t ur_command_list_manager::appendRegionCopyUnlocked(
}

wait_list_view ur_command_list_manager::getWaitListView(
const ur_event_handle_t *phWaitEvents, uint32_t numWaitEvents,
ur_event_handle_t additionalWaitEvent) {
const ur_event_handle_t *phWaitEvents, uint32_t numWaitEvents) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ur_exp_command_buffer_desc_t desc{UR_STRUCTURE_TYPE_EXP_COMMAND_BUFFER_DESC,
nullptr, false, false, false};
nullptr, false, true /* WAS: false */, false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change the buffer to in order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to extend the time of the buffer execution, but as @pbalcer mentioned, I'll revert this to out-of-order

@@ -112,6 +113,7 @@ struct urCommandBufferExpExecutionTest : uur::urKernelExecutionTest {
}

ur_exp_command_buffer_handle_t cmd_buf_handle = nullptr;
ur_exp_command_buffer_handle_t cmd_buf_in_order = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is it used?

Copy link
Contributor

@pbalcer pbalcer May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interest of time, and knowing that right now there's no difference between the two, I suggest not adding in order cmd buf here for now, and just testing out of order for now (both should exhibit the issue we are trying to solve).

@@ -132,3 +154,25 @@ TEST_P(urEnqueueCommandBufferExpTest, SerializeOutofOrderQueue) {
ASSERT_EQ(reference, Output[i]);
}
}

TEST_P(urEnqueueCommandBufferExpTest, SerializeInOrderQueue) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test is nearly identical to SerializeOutofOrderQueue. Parameters are just different. You could use UUR_DEVICE_TEST_SUITE_WITH_PARAM or just create a function with the body of the test. Whatever is simpler (though using the macro is probably more idiomatic).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've used the macro and introduced a new fixture for parametrized tests.

Copy link
Contributor

@pbalcer pbalcer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just a few nits.

if (executionEvent != nullptr) {
ZE2UR_CALL(zeEventHostSynchronize, (executionEvent->getZeEvent(), UINT64_MAX));
}

UR_CALL(enqueueGenericCommandListsExp(
1, &commandBufferCommandList, phEvent, numEventsInWaitList,
phEventWaitList, UR_COMMAND_ENQUEUE_COMMAND_BUFFER_EXP, executionEvent));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
phEventWaitList, UR_COMMAND_ENQUEUE_COMMAND_BUFFER_EXP, executionEvent));
phEventWaitList, UR_COMMAND_ENQUEUE_COMMAND_BUFFER_EXP, nullptr));

We can change this to nullptr, since we know for sure this event is already signaled (completed). No need to add it to the waitlist of the command.

@@ -98,7 +98,7 @@ struct urCommandBufferExpExecutionTest : uur::urKernelExecutionTest {
UUR_RETURN_ON_FATAL_FAILURE(checkCommandBufferSupport(device));

ur_exp_command_buffer_desc_t desc{UR_STRUCTURE_TYPE_EXP_COMMAND_BUFFER_DESC,
nullptr, false, false, false};
nullptr, false, false /* WAS: false */, false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
nullptr, false, false /* WAS: false */, false};
nullptr, false, false, false};

}

ur_queue_handle_t in_order_queue = nullptr;
ur_queue_handle_t out_of_order_queue = nullptr;
ur_queue_handle_t in_or_out_of_order_queue = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just call it queue.

Copy link
Contributor Author

@EuphoricThinking EuphoricThinking May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a queue in the parent class (urQueueTestWithParam). This queue is used in SerializeAcrossQueues

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we use the one from the parent class then? I think we can just modify queue_properties BEFORE calling parent SetUp (UUR_RETURN_ON_FATAL_FAILURE( urCommandBufferExpExecutionTestWithParam::SetUp());) to make it in-order or out-of-order

add new test fixture
test serialization of the command buffer execution
in an in-order queue

static constexpr int num_copy_buffers = 10;
static constexpr int buffer_size = 512;
int *dst_buffers[num_copy_buffers];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenCL tests are failing:

/__w/llvm/llvm/unified-runtime/test/conformance/exp_command_buffer/fixtures.h:25: Skipped
EXP command-buffer feature is not supported.
/__w/llvm/llvm/unified-runtime/test/conformance/exp_command_buffer/enqueue.cpp:93: Failure
Expected equality of these values:
  uur::Result(UR_RESULT_SUCCESS)
    Which is: UR_RESULT_SUCCESS
  uur::Result(urUSMFree(context, src_buffers[i]))
    Which is: UR_RESULT_ERROR_INVALID_VALUE

You might need to zero initialize these buffers (add = {} to both arrays);

(void **)&(src_buffers[i])));

ASSERT_SUCCESS(urEnqueueUSMMemcpy(
in_or_out_of_order_queue, false, src_buffers[i], temp_val.data(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's odd submitting this to in_or_out_of_order_queue when the calls before and after are to queue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it look better now? c:

@@ -45,6 +39,22 @@ struct urEnqueueCommandBufferExpTest
ASSERT_SUCCESS(urEnqueueUSMFill(queue, device_ptr, sizeof(zero_pattern),
&zero_pattern, allocation_size, 0, nullptr,
nullptr));

std::vector<int> temp_val(buffer_size, 3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preference for using urEnqueueUSMFill on line 53 than initializing then copying a host vector of a constant value.

static constexpr int num_copy_buffers = 10;
static constexpr int buffer_size = 512;
int *dst_buffers[num_copy_buffers];
int *src_buffers[num_copy_buffers];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

preference for fixed sized type like int32_t, whereas int is system dependent.

ur_queue_handle_t in_order_queue = nullptr;
ur_queue_handle_t out_of_order_queue = nullptr;
ur_queue_handle_t in_or_out_of_order_queue = nullptr;
ur_queue_flags_t queue_type;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a class member variable or could we just define it on line 28?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, once upon a time I've thought this looks better, but these times are gone.

synchronization implemented as the host thread waiting
for an event to be signalled after the completion of the command
buffer execution
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants