Commit 63a5241

add 45
1 parent 049b40d commit 63a5241

23 files changed: +4255 -30 lines changed

.github/workflows/test-libbpf.yml

+4
@@ -81,3 +81,7 @@ jobs:
 - name: test 43 kfuncs
   run: |
     make -C src/43-kfuncs
+
+- name: test 44
+  run: |
+    make -C src/43-kfuncs

README.md

+1
@@ -76,6 +76,7 @@ Security:
 Scheduler:

 - [lesson 44-scx-simple](src/44-scx-simple/README.md) Introduction to the BPF Scheduler
+- [lesson 45-scx-nest](src/45-scx-nest/README.md) Implementing the `scx_nest` Scheduler

 Other:

src/44-scx-simple/.gitignore

+1
@@ -1 +1,2 @@
 scx_simple
+.output

src/44-scx-simple/README.md

+43 -26
@@ -23,18 +23,11 @@ The **scx_simple** scheduler is a straightforward example of a sched_ext schedul
 1. **Global Weighted Virtual Time (vtime) Mode:** Prioritizes tasks based on their virtual time, allowing for fair scheduling across different workloads.
 2. **FIFO (First-In-First-Out) Mode:** Simple queue-based scheduling where tasks are executed in the order they arrive.

-### Use Case and Suitability
-
 scx_simple is particularly effective on single-socket CPUs with a uniform L3 cache topology. While the global FIFO mode can handle many workloads efficiently, it's essential to note that saturating threads might overshadow less active ones. Therefore, scx_simple is best suited for environments where a straightforward scheduling policy meets the performance and fairness requirements.

-### Production Readiness
-
-While scx_simple is minimalistic, it can be deployed in production settings under the right conditions:
-
-- **Hardware Constraints:** Best suited for systems with single-socket CPUs and uniform cache architectures.
-- **Workload Characteristics:** Ideal for workloads that don't require intricate scheduling policies and can benefit from simple FIFO or weighted vtime scheduling.
+While scx_simple is minimalistic, it can be deployed in production settings under the right conditions. It is best suited for systems with single-socket CPUs and uniform cache architectures. Additionally, it is ideal for workloads that don't require intricate scheduling policies and can benefit from simple FIFO or weighted vtime scheduling.

-## Diving into the Code: Kernel and User-Space Analysis
+## Into the Code: Kernel and User-Space Analysis

 Let's explore how scx_simple is implemented both in the kernel and user-space. We'll start by presenting the complete code snippets and then break down their functionalities.
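
Review note on the two modes named in this hunk: the difference between FIFO and weighted vtime ordering may be easier to see in a small simulation. The sketch below is plain Python, not BPF code, and is not taken from the commit. The function names, the `slice_ns` parameter, and the charging rule (advance vtime by the slice scaled by `100 / weight`, treating 100 as the default weight) are assumptions made for illustration.

```python
import heapq
from collections import deque


def run_fifo(tasks, rounds):
    """FIFO mode: run tasks in arrival order, re-queueing each at the tail."""
    queue = deque(name for name, _weight in tasks)
    order = []
    for _ in range(rounds):
        name = queue.popleft()
        order.append(name)
        queue.append(name)
    return order


def run_vtime(tasks, rounds, slice_ns=100):
    """Weighted vtime mode: always run the task with the smallest vtime."""
    # Heap entries are (vtime, arrival index, name, weight); the arrival
    # index breaks ties so equal vtimes fall back to arrival order.
    heap = [(0, i, name, weight) for i, (name, weight) in enumerate(tasks)]
    heapq.heapify(heap)
    order = []
    for _ in range(rounds):
        vtime, idx, name, weight = heapq.heappop(heap)
        order.append(name)
        # Assumed charging rule: heavier tasks accumulate vtime more slowly.
        vtime += slice_ns * 100 // weight
        heapq.heappush(heap, (vtime, idx, name, weight))
    return order
```

With two tasks `("a", 100)` and `("b", 200)`, `run_fifo` alternates between them regardless of weight, while `run_vtime` lets `b` run about twice as often as `a` under the assumed charging rule.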

@@ -280,6 +273,8 @@ restart:
 }
 ```

+The complete code can be found in <https://github.com/eunomia-bpf/bpf-developer-tutorial>
+
 #### User-Space Breakdown

 The user-space component is responsible for interacting with the BPF scheduler, managing its lifecycle, and monitoring its performance. Here's a snapshot of its responsibilities:
@@ -315,22 +310,43 @@ Virtual time is a mechanism to ensure fairness in scheduling by tracking how muc

 ### Scheduling Cycle

-Understanding the scheduling cycle is crucial for modifying or extending scx_simple:
-
-1. **Task Wakeup:**
-   - `ops.select_cpu()` is invoked to select an optimal CPU for the waking task.
-   - If the selected CPU is idle, the task is dispatched immediately to the local DSQ.
-
-2. **Task Enqueueing:**
-   - `ops.enqueue()` decides whether to dispatch the task to the global DSQ, a local DSQ, or a custom DSQ based on the scheduling mode.
-
-3. **Task Dispatching:**
-   - When a CPU is ready to schedule, it first checks its local DSQ, then the global DSQ, and finally invokes `ops.dispatch()` if needed.
-
-4. **Task Execution:**
-   - The CPU executes the selected task, updating its virtual time and ensuring fair scheduling.
-
-This cycle ensures that tasks are scheduled efficiently while maintaining fairness and responsiveness.
+Understanding the scheduling cycle is crucial for modifying or extending scx_simple. The following steps detail how a waking task is scheduled and executed:
+
+1. **Task Wakeup and CPU Selection:**
+   - When a task wakes up, the first operation invoked is `ops.select_cpu()`. This function serves two purposes:
+      - **CPU Selection Optimization Hint:** Provides a suggested CPU for the task to run on. While this is an optimization hint and not binding, matching the CPU the task eventually runs on can yield performance gains.
+      - **Waking Up Idle CPUs:** If the selected CPU is idle, `ops.select_cpu()` can wake it up, preparing it to execute tasks.
+   - Note: The scheduler core will ignore invalid CPU selections, such as CPUs outside the allowed CPU mask of the task.
+
+2. **Immediate Dispatch from `ops.select_cpu()`:**
+   - A task can be immediately dispatched to a Dispatch Queue (DSQ) directly from `ops.select_cpu()` by calling `scx_bpf_dispatch()`.
+   - If dispatched to `SCX_DSQ_LOCAL`, the task will be placed in the local DSQ of the CPU returned by `ops.select_cpu()`.
+   - Dispatching directly from `ops.select_cpu()` causes the `ops.enqueue()` callback to be skipped, potentially reducing scheduling latency.
+
+3. **Task Enqueueing (`ops.enqueue()`):**
+   - If the task was not dispatched in the previous step, `ops.enqueue()` is invoked.
+   - `ops.enqueue()` can make several decisions:
+      - **Immediate Dispatch:** Dispatch the task to either the global DSQ (`SCX_DSQ_GLOBAL`), a local DSQ (`SCX_DSQ_LOCAL`), or a custom DSQ by calling `scx_bpf_dispatch()`.
+      - **Queue on BPF Side:** Queue the task within the BPF program for custom scheduling logic.
+
+4. **CPU Scheduling Readiness:**
+   - When a CPU is ready to schedule, it follows this order:
+      - **Local DSQ Check:** The CPU first checks its local DSQ for tasks.
+      - **Global DSQ Check:** If the local DSQ is empty, it checks the global DSQ.
+      - **Invoke `ops.dispatch()`:** If no tasks are found, `ops.dispatch()` is invoked to populate the local DSQ.
+   - Within `ops.dispatch()`, the following functions can be used:
+      - `scx_bpf_dispatch()`: Schedules tasks to any DSQ (local, global, or custom). Note that this function currently cannot be called with BPF locks held.
+      - `scx_bpf_consume()`: Transfers a task from a specified non-local DSQ to the dispatching DSQ. This function cannot be called with any BPF locks held and will flush pending dispatched tasks before attempting to consume the specified DSQ.
+
+5. **Task Execution Decision:**
+   - After `ops.dispatch()` returns, if there are tasks in the local DSQ, the CPU runs the first one.
+   - If the local DSQ is still empty, the CPU performs the following steps:
+      - **Consume Global DSQ:** Attempts to consume a task from the global DSQ using `scx_bpf_consume()`. If successful, the task is executed.
+      - **Retry Dispatch:** If `ops.dispatch()` has dispatched any tasks, the CPU retries checking the local DSQ.
+      - **Execute Previous Task:** If the previous task is an SCX task and still runnable, the CPU continues executing it (see `SCX_OPS_ENQ_LAST`).
+      - **Idle State:** If no tasks are available, the CPU goes idle.
+
+This scheduling cycle ensures that tasks are scheduled efficiently while maintaining fairness and responsiveness. By understanding each step, developers can modify or extend scx_simple to implement custom scheduling behaviors that meet specific requirements.

 ## Compiling and Running scx_simple
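
Review note on the scheduling cycle added in this hunk: the lookup order in steps 4 and 5 can be sketched as a small simulation. This is plain Python, not the sched_ext API. The `Cpu` class, `pick_next_task`, and the `dispatch` callback are hypothetical names, and the model assumes that the callback places tasks straight into the CPU's local DSQ, so the "retry dispatch" step collapses into a second local check.

```python
from collections import deque


class Cpu:
    """Toy stand-in for one CPU: a local DSQ plus the task it ran last."""

    def __init__(self):
        self.local_dsq = deque()
        self.prev = None  # previous task, set only while it is still runnable


def pick_next_task(cpu, global_dsq, dispatch=None):
    """Return (task, source) using the order described in the README hunk."""
    # Step 4: check the local DSQ first, then the global DSQ.
    if cpu.local_dsq:
        return cpu.local_dsq.popleft(), "local"
    if global_dsq:
        return global_dsq.popleft(), "global"

    # Nothing found: let the hypothetical dispatch callback refill the local DSQ.
    if dispatch is not None:
        dispatch(cpu)

    # Step 5: run what dispatch produced, otherwise retry the global DSQ.
    if cpu.local_dsq:
        return cpu.local_dsq.popleft(), "dispatch"
    if global_dsq:
        return global_dsq.popleft(), "global"

    # Keep running the previous task if it is still runnable, otherwise idle.
    if cpu.prev is not None:
        return cpu.prev, "prev"
    return None, "idle"
```

Calling `pick_next_task` repeatedly on a CPU with one local task and one global task returns the local task first and the global task second. Only once both queues are empty does the dispatch callback, the previous task, or the idle result come into play.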

@@ -412,7 +428,7 @@ In this tutorial, we've introduced the **sched_ext** scheduler class and walked

 By mastering scx_simple, you're well-equipped to design and implement more sophisticated scheduling policies tailored to your specific requirements. Whether you're optimizing for performance, fairness, or specific workload characteristics, sched_ext and eBPF offer the flexibility and power to achieve your goals.

-> Ready to take your eBPF skills to the next level? Dive deeper into our tutorials and explore more examples by visiting our [tutorial repository](https://github.com/eunomia-bpf/bpf-developer-tutorial) or our [website](https://eunomia.dev/tutorials/).
+> Ready to take your eBPF skills to the next level? Dive deeper into our tutorials and explore more examples by visiting our [tutorial repository https://github.com/eunomia-bpf/bpf-developer-tutorial](https://github.com/eunomia-bpf/bpf-developer-tutorial) or our [website https://eunomia.dev/tutorials/](https://eunomia.dev/tutorials/).

 ## References

@@ -423,3 +439,4 @@ By mastering scx_simple, you're well-equipped to design and implement more sophi
 - **libbpf Documentation:** [https://github.com/libbpf/libbpf](https://github.com/libbpf/libbpf)

 Feel free to explore these resources to expand your understanding and continue your journey into advanced eBPF programming!
+
