|
| 1 | +.. _llext_debug: |
| 2 | + |
| 3 | +Debugging extensions |
| 4 | +#################### |
| 5 | + |
| 6 | +Debugging extensions is a complex task. Since the extension code is by |
| 7 | +definition not built with the Zephyr application, the final Zephyr ELF file |
| 8 | +does not contain the symbols for extension code. Furthermore, the extension is |
| 9 | +dynamically relocated by :c:func:`llext_load` at runtime, so even if the |
| 10 | +symbols were available, it would be impossible for the debugger to know the |
| 11 | +final locations of the symbols in the extension code. |
| 12 | + |
| 13 | +Setting up the debugger session properly in this case requires a few manual |
| 14 | +steps. The following sections will provide some tips on how to do it with the |
| 15 | +Zephyr SDK and the debug features provided by ``west``, but the instructions |
| 16 | +can be adapted to any GDB-based debugging environment. |
| 17 | + |
| 18 | +Extension debugging process |
| 19 | +=========================== |
| 20 | + |
| 21 | +1. Make sure the project is set up to display the verbose LLEXT debug output |
| 22 | + (:kconfig:option:`CONFIG_LOG` and :kconfig:option:`CONFIG_LLEXT_LOG_LEVEL_DBG` |
| 23 | + are set). |
| 24 | + |
| 25 | +2. Build the Zephyr application and the extensions. |
| 26 | + |
| 27 | + For each target ``name`` included in the current build, two files will be |
| 28 | + generated into the ``llext`` subdirectory of the build root: |
| 29 | + |
| 30 | + ``name_ext_debug.elf`` |
| 31 | + |
| 32 | + An intermediate ELF file with full debugging information. |
| 33 | + |
| 34 | + ``name.llext`` |
| 35 | + |
| 36 | + The final extension binary, stripped to the essential data required for |
| 37 | + loading into the Zephyr application. |
| 38 | + |
| 39 | + Other files may be present, depending on the target architecture and the |
| 40 | + build configuration. |
| 41 | + |
| 42 | +3. Start a debugging session of the main Zephyr application. This is described |
| 43 | + in the :ref:`Debugging <west-debugging>` section of the documentation; on |
| 44 | + supported boards it is as easy as running ``west debug``, perhaps with some |
| 45 | + additional arguments. |
| 46 | + |
| 47 | +4. Set a breakpoint just after the call to :c:func:`llext_load` in your code |
| 48 | + and let the application run until it is hit. This loads the extension into memory and relocates it. |
| 49 | + The output logs will contain a line with ``gdb add-symbol-file flags:``, |
| 50 | + followed by lines all starting with ``-s``. |
| 51 | + |
| 52 | +5. Type this command in the GDB console to load this extension's symbols: |
| 53 | + |
| 54 | + .. code-block:: |
| 55 | +
|
| 56 | + add-symbol-file <path-to-debug.elf> <load-addresses> |
| 57 | +
|
| 58 | + where ``<path-to-debug.elf>`` is the full path of the ELF file with debug |
| 59 | + information identified in step 2, and ``<load-addresses>`` is a space |
| 60 | + separated list of all the ``-s`` lines collected from the log in the |
| 61 | + previous step. |
| 62 | + |
| 63 | +6. The extension symbols are now available to the debugger. You can set |
| 64 | + breakpoints, inspect variables, and step through the code as usual. |
| 65 | + |
| 66 | +Steps 4-6 can be repeated for every extension that is loaded by the |
| 67 | +application, if there are several. |
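Collecting the ``-s`` lines by hand is error-prone, so the command in step 5 can be assembled by a small script. The following is a minimal Python sketch, not part of Zephyr or ``west``. The function name ``build_add_symbol_file`` is hypothetical, and the parsing assumes that the log looks like the excerpt described in step 4 (a marker line followed by consecutive ``-s <section> <address>`` lines). If the log contains several such blocks, the sketch keeps the last one.

```python
import re

# Matches the flag part of a log line such as "D: -s .text 0x20000034".
# Any prefix printed by the logging backend before "-s" is ignored.
FLAG_RE = re.compile(r"-s\s+(\S+)\s+(0x[0-9a-fA-F]+)")

MARKER = "gdb add-symbol-file flags:"


def build_add_symbol_file(debug_elf, log_text):
    """Return a GDB command loading debug_elf at the logged addresses."""
    flags = []
    collecting = False
    for line in log_text.splitlines():
        if MARKER in line:
            # A new marker starts a fresh block (last block wins).
            flags = []
            collecting = True
            continue
        if not collecting:
            continue
        match = FLAG_RE.search(line)
        if match is None:
            # The first line without a "-s" flag ends the block.
            collecting = False
            continue
        flags.append(f"-s {match.group(1)} {match.group(2)}")
    if not flags:
        raise ValueError("no 'gdb add-symbol-file flags:' block in log")
    return " ".join(["add-symbol-file", debug_elf] + flags)
```

The returned string can be pasted into the GDB console as-is. The path is inserted without quoting, so this sketch assumes it contains no spaces.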
| 68 | + |
| 69 | +Symbol lookup issues |
| 70 | +==================== |
| 71 | + |
| 72 | +.. warning:: |
| 73 | + |
| 74 | + It is almost certain that the loaded symbols will be shadowed by others in |
| 75 | + the main application; for example, they may be located inside the memory |
| 76 | + area of the ELF buffer or the LLEXT heap. |
| 77 | + |
| 78 | + In this case GDB chooses the first known symbol and therefore associates the |
| 79 | + addresses with something like ``elf_buffer+0x123`` instead of the expected ``ext_fn``. |
| 80 | + This further confuses its high-level operations like source stepping or |
| 81 | + inspecting locals, since they are meaningless in that context. |
| 82 | + |
| 83 | +Two possible solutions to this problem are discussed in the following |
| 84 | +paragraphs. |
| 85 | + |
| 86 | +Discard all Zephyr symbols |
| 87 | +-------------------------- |
| 88 | + |
| 89 | +The simplest option is to drop all the Zephyr application symbols from GDB by |
| 90 | +invoking ``symbol-file`` with no arguments, before step 5. This will |
| 91 | +however restrict the debugging session to the llext only, as all information about |
| 92 | +the Zephyr application will be lost. For example, the debugger may not be able to |
| 93 | +properly follow stack traces outside the extension code. |
| 94 | + |
| 95 | +It is possible to use the same technique multiple times in the same session to |
| 96 | +switch between the main and extension symbol tables as required, but it rapidly |
| 97 | +becomes cumbersome. |
| 98 | + |
| 99 | +Edit the ELF file |
| 100 | +----------------- |
| 101 | + |
| 102 | +This alternative is more complex but allows for a more seamless debugging |
| 103 | +experience. The idea is to edit the main Zephyr ELF file to remove information |
| 104 | +about the symbols that overlap with the extension that is to be debugged, so |
| 105 | +that when the extension symbols are loaded, GDB will not have any ambiguity. |
| 106 | +This can be done by using ``objcopy`` with the ``-N <symbol>`` option. |
| 107 | + |
| 108 | +Identifying the offending symbols is however an iterative trial-and-error |
| 109 | +procedure, as there can be many different layers; for example, the ELF buffer |
| 110 | +may be itself contained in a symbol for the data segment. Fortunately, this |
| 111 | +knowledge can then be used several times as the list is unlikely to change for |
| 112 | +a given project. |
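A first list of candidates can be obtained by comparing the symbol table of the Zephyr ELF file with the memory regions used by the extension. The following Python sketch is only an illustration under stated assumptions: the helper names are hypothetical, the input is assumed to be the text printed by ``nm -S`` on the Zephyr ELF file (four columns for sized symbols: value, size, type, name), and the region bounds must be supplied by the user, since the log only reports load addresses. Symbols that ``nm`` prints without a size, such as linker-defined markers, are skipped here and still have to be found by trial and error.

```python
def parse_nm_sized(nm_output):
    """Yield (name, start, end) for each symbol printed with a size."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Sized entries have four columns: value, size, type, name.
        if len(fields) != 4:
            continue
        value, size, _type, name = fields
        start = int(value, 16)
        yield name, start, start + int(size, 16)


def overlapping_symbols(nm_output, regions):
    """Return names of sized symbols overlapping any (start, end) region."""
    hits = []
    for name, sym_start, sym_end in parse_nm_sized(nm_output):
        # Half-open interval overlap test against every region.
        if any(sym_start < r_end and r_start < sym_end
               for r_start, r_end in regions):
            hits.append(name)
    return hits


def objcopy_args(symbols, src, dst):
    """Build objcopy arguments that strip the given symbols from src."""
    args = []
    for name in symbols:
        args += ["-N", name]
    return args + [src, dst]
```

When section sizes are not known, each load address can be passed as a one-byte region such as ``(0x20000034, 0x20000035)``. The list returned by ``objcopy_args`` is meant to be appended to the ``objcopy`` executable of the toolchain in use.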
| 113 | + |
| 114 | +Example debugging session |
| 115 | +========================= |
| 116 | + |
| 117 | +This example demonstrates how to debug the ``detached_fn`` extension in the |
| 118 | +``tests/subsys/llext`` project (specifically, the ``writable`` case), on an |
| 119 | +emulated ``mps2/an385`` board which is based on an ARM Cortex-M3. |
| 120 | + |
| 121 | +.. note:: |
| 122 | + |
| 123 | + The logs below have been obtained using Zephyr version 4.1 and the Zephyr |
| 124 | + SDK version 0.17.0. However, the exact addresses may still vary between |
| 125 | + runs even when using the same versions. Adjust the commands below to |
| 126 | + match the results of your own session. |
| 127 | + |
| 128 | +The following command will build the project and start the emulator in |
| 129 | +debugging mode: |
| 130 | + |
| 131 | +.. code-block:: |
| 132 | + :caption: Terminal 1 (build, QEMU emulator, GDB server) |
| 133 | +
|
| 134 | + zephyr$ west build -p -b mps2/an385 tests/subsys/llext/ -T llext.writable -t debugserver_qemu |
| 135 | + -- west build: generating a build system |
| 136 | + [...] |
| 137 | + -- west build: running target debugserver_qemu |
| 138 | + [...] |
| 139 | + [186/187] To exit from QEMU enter: 'CTRL+a, x'[QEMU] CPU: cortex-m3 |
| 140 | +
|
| 141 | +On a separate terminal, set ``ZEPHYR_SDK_INSTALL_DIR`` to the directory for the |
| 142 | +Zephyr SDK on your installation, then start the GDB client for the target: |
| 143 | + |
| 144 | +.. code-block:: |
| 145 | + :caption: Terminal 2 (GDB client) |
| 146 | +
|
| 147 | + zephyr$ export ZEPHYR_SDK_INSTALL_DIR=/opt/zephyr-sdk-0.17.0 |
| 148 | + zephyr$ ${ZEPHYR_SDK_INSTALL_DIR}/arm-zephyr-eabi/bin/arm-zephyr-eabi-gdb build/zephyr/zephyr.elf |
| 149 | + GNU gdb (Zephyr SDK 0.17.0) 12.1 |
| 150 | + [...] |
| 151 | + Reading symbols from build/zephyr/zephyr.elf... |
| 152 | + (gdb) |
| 153 | +
|
| 154 | +Connect to the GDB server, set a breakpoint on the ``llext_load`` function, and run |
| 155 | +until it returns: |
| 156 | + |
| 157 | +.. code-block:: |
| 158 | + :caption: Terminal 2 (GDB client) |
| 159 | +
|
| 160 | + (gdb) target extended-remote :1234 |
| 161 | + Remote debugging using :1234 |
| 162 | + z_arm_reset () at zephyr/arch/arm/core/cortex_m/reset.S:124 |
| 163 | + 124 movs.n r0, #_EXC_IRQ_DEFAULT_PRIO |
| 164 | + (gdb) break llext_load |
| 165 | + Breakpoint 1 at 0x236c: file zephyr/subsys/llext/llext.c, line 168. |
| 166 | + (gdb) continue |
| 167 | + Continuing. |
| 168 | +
|
| 169 | + Breakpoint 1, llext_load (ldr=ldr@entry=0x2000bef0 <ztest_thread_stack+3488>, |
| 170 | + name=name@entry=0x9d98 "test_detached", |
| 171 | + ext=ext@entry=0x2000abb8 <detached_llext>, |
| 172 | + ldr_parm=ldr_parm@entry=0x2000bee8 <ztest_thread_stack+3480>) |
| 173 | + at zephyr/subsys/llext/llext.c:168 |
| 174 | + 168 *ext = llext_by_name(name); |
| 175 | + (gdb) finish |
| 176 | + Run till exit from #0 llext_load ([...]) |
| 177 | + at zephyr/subsys/llext/llext.c:168 |
| 178 | + llext_test_detached () at zephyr/tests/subsys/llext/src/test_llext.c:481 |
| 179 | + 481 zassert_ok(res, "load should succeed"); |
| 180 | +
|
| 181 | +The first terminal will have printed lots of debugging information related to |
| 182 | +the extension loading. Find the section with the addresses: |
| 183 | + |
| 184 | +.. code-block:: |
| 185 | + :caption: Terminal 1 (build, QEMU emulator, GDB server) |
| 186 | +
|
| 187 | + [...] |
| 188 | + D: Allocate and copy regions... |
| 189 | + [...] |
| 190 | + D: gdb add-symbol-file flags: |
| 191 | + D: -s .text 0x20000034 |
| 192 | + D: -s .data 0x200000b4 |
| 193 | + D: -s .bss 0x2000c2e0 |
| 194 | + D: -s .rodata 0x200000b8 |
| 195 | + D: -s .detach 0x200001d0 |
| 196 | + D: Counting exported symbols... |
| 197 | + [...] |
| 198 | +
|
| 199 | +Use these addresses to load the symbols into GDB: |
| 200 | + |
| 201 | +.. code-block:: |
| 202 | + :caption: Terminal 2 (GDB client) |
| 203 | +
|
| 204 | + (gdb) add-symbol-file build/llext/detached_fn_ext_debug.elf -s .text 0x20000034 -s .data 0x200000b4 -s .bss 0x2000c2e0 -s .rodata 0x200000b8 -s .detach 0x200001d0 |
| 205 | + add symbol table from file "build/llext/detached_fn_ext_debug.elf" at |
| 206 | + .text_addr = 0x20000034 |
| 207 | + .data_addr = 0x200000b4 |
| 208 | + .bss_addr = 0x2000c2e0 |
| 209 | + .rodata_addr = 0x200000b8 |
| 210 | + .detach_addr = 0x200001d0 |
| 211 | + (y or n) y |
| 212 | + Reading symbols from build/llext/detached_fn_ext_debug.elf... |
| 213 | + (gdb) break detached_entry |
| 214 | + Breakpoint 2 at 0x200001d0 (2 locations) |
| 215 | + (gdb) continue |
| 216 | + Continuing. |
| 217 | +
|
| 218 | + Breakpoint 2, 0x200001d0 in test_detached_ext () |
| 219 | + (gdb) backtrace |
| 220 | + #0 0x200001d0 in test_detached_ext () |
| 221 | + #1 0x200000ac in test_detached_ext () |
| 222 | + #2 0x00000706 in llext_test_detached () at zephyr/tests/subsys/llext/src/test_llext.c:496 |
| 223 | + #3 0x00001a36 in run_test_functions (suite=0x92bc <z_ztest_test_node_llext>, data=0x0 <cbvprintf_package>, test=0x92d8 <z_ztest_unit_test.llext.test_detached>) at zephyr/subsys/testsuite/ztest/src/ztest.c:328 |
| 224 | + #4 test_cb (a=0x92bc <z_ztest_test_node_llext>, b=0x92d8 <z_ztest_unit_test.llext.test_detached>, c=0x0 <cbvprintf_package>) at zephyr/subsys/testsuite/ztest/src/ztest.c:662 |
| 225 | + #5 0x00000e96 in z_thread_entry (entry=0x1a05 <test_cb>, p1=0x92bc <z_ztest_test_node_llext>, p2=0x92d8 <z_ztest_unit_test.llext.test_detached>, p3=0x0 <cbvprintf_package>) at zephyr/lib/os/thread_entry.c:48 |
| 226 | + #6 0x00000000 in ?? () |
| 227 | +
|
| 228 | +The symbols associated with the breakpoint location and the innermost stack frames |
| 229 | +(``#0`` and ``#1``) mistakenly reference the ELF buffer in the Zephyr application instead |
| 230 | +of the extension symbols. Note, however, that GDB knows both symbols for each address: |
| 231 | + |
| 232 | +.. code-block:: |
| 233 | + :caption: Terminal 2 (GDB client) |
| 234 | +
|
| 235 | + (gdb) info sym 0x200001d0 |
| 236 | + test_detached_ext + 464 in section datas of zephyr/build/zephyr/zephyr.elf |
| 237 | + detached_entry in section .detach of zephyr/build/llext/detached_fn_ext_debug.elf |
| 238 | + (gdb) info sym 0x200000ac |
| 239 | + test_detached_ext + 172 in section datas of zephyr/build/zephyr/zephyr.elf |
| 240 | + test_entry + 8 in section .text of zephyr/build/llext/detached_fn_ext_debug.elf |
| 241 | +
|
| 242 | +It is also impossible to inspect the variables in the extension or step through |
| 243 | +code properly: |
| 244 | + |
| 245 | +.. code-block:: |
| 246 | + :caption: Terminal 2 (GDB client) |
| 247 | +
|
| 248 | + (gdb) print bss_cnt |
| 249 | + No symbol "bss_cnt" in current context. |
| 250 | + (gdb) print data_cnt |
| 251 | + No symbol "data_cnt" in current context. |
| 252 | + (gdb) next |
| 253 | + Single stepping until exit from function test_detached_ext, |
| 254 | + which has no line number information. |
| 255 | +
|
| 256 | + Breakpoint 2, 0x200001ea in test_detached_ext () |
| 257 | + (gdb) |
| 258 | +
|
| 259 | +Discarding symbols |
| 260 | +------------------ |
| 261 | + |
| 262 | +Discarding the Zephyr symbols and only focusing on the extension restores full |
| 263 | +debugging functionality at the cost of losing the global context (note the |
| 264 | +backtrace stops outside the extension): |
| 265 | + |
| 266 | +.. code-block:: |
| 267 | + :caption: Terminal 2 (GDB client) |
| 268 | +
|
| 269 | + (gdb) symbol-file |
| 270 | + Discard symbol table from `zephyr/build/zephyr/zephyr.elf'? (y or n) y |
| 271 | + Error in re-setting breakpoint 1: No symbol table is loaded. Use the "file" command. |
| 272 | + No symbol file now. |
| 273 | + (gdb) add-symbol-file build/llext/detached_fn_ext_debug.elf -s .text 0x20000034 -s .data 0x200000b4 -s .bss 0x2000c2e0 -s .rodata 0x200000b8 -s .detach 0x200001d0 |
| 274 | + add symbol table from file "build/llext/detached_fn_ext_debug.elf" at |
| 275 | + .text_addr = 0x20000034 |
| 276 | + .data_addr = 0x200000b4 |
| 277 | + .bss_addr = 0x2000c2e0 |
| 278 | + .rodata_addr = 0x200000b8 |
| 279 | + .detach_addr = 0x200001d0 |
| 280 | + (y or n) y |
| 281 | + Reading symbols from build/llext/detached_fn_ext_debug.elf... |
| 282 | + (gdb) backtrace |
| 283 | + #0 detached_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:18 |
| 284 | + #1 0x200000ac in test_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:26 |
| 285 | + #2 0x00000706 in ?? () |
| 286 | + Backtrace stopped: previous frame identical to this frame (corrupt stack?) |
| 287 | + (gdb) next |
| 288 | + 19 zassert_true(data_cnt < 0); |
| 289 | + (gdb) print bss_cnt |
| 290 | + $1 = 1 |
| 291 | + (gdb) print data_cnt |
| 292 | + $2 = -2 |
| 293 | + (gdb) |
| 294 | +
|
| 295 | +
|
| 296 | +Editing the ELF file |
| 297 | +-------------------- |
| 298 | + |
| 299 | +In this alternative approach, the patches to the Zephyr ELF file must be |
| 300 | +performed after building the Zephyr binary and starting the emulator on |
| 301 | +Terminal 1, but before starting the GDB client on Terminal 2. |
| 302 | + |
| 303 | +The above debugging session already identified ``test_detached_ext``, the char |
| 304 | +array that holds the ELF file, as an offending symbol, so that will be removed |
| 305 | +in a first pass. Performing the same steps multiple times, ``__data_start`` and |
| 306 | +``__data_region_start`` can also be found to overlap the memory area of |
| 307 | +interest. |
| 308 | + |
| 309 | +The following commands will remove all of these from the Zephyr ELF file, then |
| 310 | +start a debugging session on the modified file: |
| 311 | + |
| 312 | +.. code-block:: |
| 313 | + :caption: Terminal 2 (GDB client) |
| 314 | +
|
| 315 | + zephyr$ export ZEPHYR_SDK_INSTALL_DIR=/opt/zephyr-sdk-0.17.0 |
| 316 | + zephyr$ ${ZEPHYR_SDK_INSTALL_DIR}/arm-zephyr-eabi/bin/arm-zephyr-eabi-objcopy -N test_detached_ext -N __data_start -N __data_region_start build/zephyr/zephyr.elf build/zephyr/zephyr-edit.elf |
| 317 | + zephyr$ ${ZEPHYR_SDK_INSTALL_DIR}/arm-zephyr-eabi/bin/arm-zephyr-eabi-gdb build/zephyr/zephyr-edit.elf |
| 318 | + GNU gdb (Zephyr SDK 0.17.0) 12.1 |
| 319 | + [...] |
| 320 | + Reading symbols from build/zephyr/zephyr-edit.elf... |
| 321 | + (gdb) |
| 322 | +
|
| 323 | +The same steps used in the previous run can be performed again to attach to the |
| 324 | +GDB server and load both the extension and its debug symbols. This time, however, |
| 325 | +the result is rather different: |
| 326 | + |
| 327 | + * the ``break`` command includes line number information; |
| 328 | + |
| 329 | + * the output from ``backtrace`` contains functions from both the extension and |
| 330 | + the Zephyr application; |
| 331 | + |
| 332 | + * the local variables can be properly inspected. |
| 333 | + |
| 334 | +.. code-block:: |
| 335 | + :caption: Terminal 2 (GDB client) |
| 336 | +
|
| 337 | + (gdb) add-symbol-file build/llext/detached_fn_ext_debug.elf [...] |
| 338 | + [...] |
| 339 | + Reading symbols from build/llext/detached_fn_ext_debug.elf... |
| 340 | + (gdb) break detached_entry |
| 341 | + Breakpoint 2 at 0x200001d6: file zephyr/tests/subsys/llext/src/detached_fn_ext.c, line 17. |
| 342 | + (gdb) continue |
| 343 | + Continuing. |
| 344 | +
|
| 345 | + Breakpoint 2, detached_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:17 |
| 346 | + 17 printk("bss %u @ %p\n", bss_cnt++, &bss_cnt); |
| 347 | + (gdb) backtrace |
| 348 | + #0 detached_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:17 |
| 349 | + #1 0x200000ac in test_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:26 |
| 350 | + #2 0x00000706 in llext_test_detached () at zephyr/tests/subsys/llext/src/test_llext.c:496 |
| 351 | + #3 0x00001a36 in run_test_functions (suite=0x92bc <z_ztest_test_node_llext>, data=0x0 <cbvprintf_package>, test=0x92d8 <z_ztest_unit_test.llext.test_detached>) at zephyr/subsys/testsuite/ztest/src/ztest.c:328 |
| 352 | + #4 test_cb (a=0x92bc <z_ztest_test_node_llext>, b=0x92d8 <z_ztest_unit_test.llext.test_detached>, c=0x0 <cbvprintf_package>) at zephyr/subsys/testsuite/ztest/src/ztest.c:662 |
| 353 | + #5 0x00000e96 in z_thread_entry (entry=0x1a05 <test_cb>, p1=0x92bc <z_ztest_test_node_llext>, p2=0x92d8 <z_ztest_unit_test.llext.test_detached>, p3=0x0 <cbvprintf_package>) at zephyr/lib/os/thread_entry.c:48 |
| 354 | + #6 0x00000000 in ?? () |
| 355 | + (gdb) print bss_cnt |
| 356 | + $1 = 0 |
| 357 | + (gdb) print data_cnt |
| 358 | + $2 = -3 |
| 359 | + (gdb) |