Commit 00ccbce

pillo79 authored and kartben committed
doc: llext: add extension debugging guide
Add a new section to the llext documentation that explains how to debug extensions and how to address the issues that may arise when doing so.

Signed-off-by: Luca Burelli <[email protected]>
1 parent 8660020 commit 00ccbce

3 files changed, +361 -7 lines changed

doc/services/llext/debug.rst

+359
@@ -0,0 +1,359 @@
1+
.. _llext_debug:
2+
3+
Debugging extensions
4+
####################
5+
6+
Debugging extensions is a complex task. Since the extension code is by
7+
definition not built with the Zephyr application, the final Zephyr ELF file
8+
does not contain the symbols for extension code. Furthermore, the extension is
9+
dynamically relocated by :c:func:`llext_load` at runtime, so even if the
10+
symbols were available, it would be impossible for the debugger to know the
11+
final locations of the symbols in the extension code.
12+
13+
Setting up the debugger session properly in this case requires a few manual
14+
steps. The following sections will provide some tips on how to do it with the
15+
Zephyr SDK and the debug features provided by ``west``, but the instructions
16+
can be adapted to any GDB-based debugging environment.
17+
18+
Extension debugging process
19+
===========================
20+
21+
1. Make sure the project is set up to display the verbose LLEXT debug output
22+
(:kconfig:option:`CONFIG_LOG` and :kconfig:option:`CONFIG_LLEXT_LOG_LEVEL_DBG`
23+
are set).
24+
25+
2. Build the Zephyr application and the extensions.
26+
27+
For each target ``name`` included in the current build, two files will be
28+
generated into the ``llext`` subdirectory of the build root:
29+
30+
``name_ext_debug.elf``
31+
32+
An intermediate ELF file with full debugging information.
33+
34+
``name.llext``
35+
36+
The final extension binary, stripped to the essential data required for
37+
loading into the Zephyr application.
38+
39+
Other files may be present, depending on the target architecture and the
40+
build configuration.
41+
42+
3. Start a debugging session of the main Zephyr application. This is described
43+
in the :ref:`Debugging <west-debugging>` section of the documentation; on
44+
supported boards it is as easy as running ``west debug``, perhaps with some
45+
additional arguments.
46+
47+
4. Set a breakpoint just after the call to :c:func:`llext_load` in your code
48+
and let the application run. This will load the extension into memory and relocate it.
49+
The output logs will contain a line with ``gdb add-symbol-file flags:``,
50+
followed by lines all starting with ``-s``.
51+
52+
5. Type this command in the GDB console to load this extension's symbols:
53+
54+
.. code-block::
55+
56+
add-symbol-file <path-to-debug.elf> <load-addresses>
57+
58+
where ``<path-to-debug.elf>`` is the full path of the ELF file with debug
59+
information identified in step 2, and ``<load-addresses>`` is a
60+
space-separated list of all the ``-s`` lines collected from the log in the
61+
previous step.
62+
63+
6. The extension symbols are now available to the debugger. You can set
64+
breakpoints, inspect variables, and step through the code as usual.
65+
66+
Steps 4-6 can be repeated for every extension that is loaded by the
67+
application, if there are several.
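
Steps 4 and 5 can be partly automated. The following is a minimal sketch, not
part of Zephyr or ``west``: the function name ``build_add_symbol_file`` is
hypothetical, and the parser assumes the log format shown in this guide (a
``gdb add-symbol-file flags:`` line followed by ``-s <section> <address>``
lines). Paths containing spaces are not handled.

```python
import re

# Matches the tail of a log line such as "D: -s .text 0x20000034".
# Only the "-s <section> <address>" part is anchored, since the log
# prefix ("D: " in this guide) is an assumption that depends on the backend.
FLAG_RE = re.compile(r"-s\s+(\S+)\s+(0x[0-9a-fA-F]+)\s*$")
HEADER = "gdb add-symbol-file flags:"


def build_add_symbol_file(log_text, debug_elf):
    """Return a GDB add-symbol-file command built from LLEXT debug log text."""
    flags = []
    collecting = False
    for line in log_text.splitlines():
        if HEADER in line:
            # A new block starts: keep only the most recent one.
            collecting = True
            flags = []
            continue
        if collecting:
            match = FLAG_RE.search(line)
            if match is None:
                # First line that is not a "-s" flag ends the block.
                collecting = False
            else:
                flags.append(f"-s {match.group(1)} {match.group(2)}")
    if not flags:
        raise ValueError("no 'gdb add-symbol-file flags:' block found in log")
    return " ".join(["add-symbol-file", debug_elf, *flags])
```

The returned string can be pasted into the GDB console as is. When several
extensions are loaded, only the last block in the log text is used, so feed
the helper the log captured right after the relevant :c:func:`llext_load` call.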
68+
69+
Symbol lookup issues
70+
====================
71+
72+
.. warning::
73+
74+
It is almost certain that the loaded symbols will be shadowed by others in
75+
the main application; for example, they may be located inside the memory
76+
area of the ELF buffer or the LLEXT heap.
77+
78+
In this case GDB chooses the first known symbol and therefore associates the
79+
addresses to some ``elf_buffer+0x123`` instead of an expected ``ext_fn``.
80+
This further confuses its high-level operations like source stepping or
81+
inspecting locals, since they are meaningless in that context.
82+
83+
Two possible solutions to this problem are discussed in the following
84+
paragraphs.
85+
86+
Discard all Zephyr symbols
87+
--------------------------
88+
89+
The simplest option is to drop all the Zephyr application symbols from GDB by
90+
invoking ``symbol-file`` with no arguments, before step 5. This will
91+
however restrict the debugging session to the llext only, as all information about
92+
the Zephyr application will be lost. For example, the debugger may not be able to
93+
properly follow stack traces outside the extension code.
94+
95+
It is possible to use the same technique multiple times in the same session to
96+
switch between the main and extension symbol tables as required, but it rapidly
97+
becomes cumbersome.
98+
99+
Edit the ELF file
100+
-----------------
101+
102+
This alternative is more complex but allows for a more seamless debugging
103+
experience. The idea is to edit the main Zephyr ELF file to remove information
104+
about the symbols that overlap with the extension that is to be debugged, so
105+
that when the extension symbols are loaded, GDB will not have any ambiguity.
106+
This can be done by using ``objcopy`` with the ``-N <symbol>`` option.
107+
108+
Identifying the offending symbols is however an iterative trial-and-error
109+
procedure, as there can be many different layers; for example, the ELF buffer
110+
may be itself contained in a symbol for the data segment. Fortunately, this
111+
knowledge can then be used several times as the list is unlikely to change for
112+
a given project.
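
Since the list of offending symbols is reused across sessions, it can be kept
in a small script. The sketch below is illustrative only: the helper name
``objcopy_strip_args`` is hypothetical, the ``objcopy`` binary name assumes the
Zephyr SDK ARM toolchain is on ``PATH``, and the symbol names and file paths are
the ones from the example session later in this guide.

```python
import shlex


def objcopy_strip_args(objcopy, symbols, src_elf, dst_elf):
    """Build an objcopy argument list that drops the given symbols."""
    args = [objcopy]
    for name in symbols:
        # -N is the short form of --strip-symbol.
        args += ["-N", name]
    # objcopy writes the edited copy to dst_elf and leaves src_elf untouched.
    args += [src_elf, dst_elf]
    return args


# Symbols found to overlap the extension in this guide's example session;
# the list for another project has to be found by trial and error.
OFFENDING = ["test_detached_ext", "__data_start", "__data_region_start"]

args = objcopy_strip_args(
    "arm-zephyr-eabi-objcopy",
    OFFENDING,
    "build/zephyr/zephyr.elf",
    "build/zephyr/zephyr-edit.elf",
)
# Print a shell-quoted command line for inspection before running it.
print(shlex.join(args))
```

The argument list can be passed to ``subprocess.run(args, check=True)`` to
perform the edit. Writing to a separate output file keeps the original ELF
available for flashing or for sessions that do not involve the extension.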
113+
114+
Example debugging session
115+
=========================
116+
117+
This example demonstrates how to debug the ``detached_fn`` extension in the
118+
``tests/subsys/llext`` project (specifically, the ``writable`` case), on an
119+
emulated ``mps2/an385`` board, which is based on an ARM Cortex-M3.
120+
121+
.. note::
122+
123+
The logs below have been obtained using Zephyr version 4.1 and the Zephyr
124+
SDK version 0.17.0. However, the exact addresses may still vary between
125+
runs even when using the same versions. Adjust the commands below to
126+
match the results of your own session.
127+
128+
The following command will build the project and start the emulator in
129+
debugging mode:
130+
131+
.. code-block::
132+
:caption: Terminal 1 (build, QEMU emulator, GDB server)
133+
134+
zephyr$ west build -p -b mps2/an385 tests/subsys/llext/ -T llext.writable -t debugserver_qemu
135+
-- west build: generating a build system
136+
[...]
137+
-- west build: running target debugserver_qemu
138+
[...]
139+
[186/187] To exit from QEMU enter: 'CTRL+a, x'[QEMU] CPU: cortex-m3
140+
141+
On a separate terminal, set ``ZEPHYR_SDK_INSTALL_DIR`` to the directory for the
142+
Zephyr SDK on your installation, then start the GDB client for the target:
143+
144+
.. code-block::
145+
:caption: Terminal 2 (GDB client)
146+
147+
zephyr$ export ZEPHYR_SDK_INSTALL_DIR=/opt/zephyr-sdk-0.17.0
148+
zephyr$ ${ZEPHYR_SDK_INSTALL_DIR}/arm-zephyr-eabi/bin/arm-zephyr-eabi-gdb build/zephyr/zephyr.elf
149+
GNU gdb (Zephyr SDK 0.17.0) 12.1
150+
[...]
151+
Reading symbols from build/zephyr/zephyr.elf...
152+
(gdb)
153+
154+
Connect, set a breakpoint on the ``llext_load`` function and run until it
155+
finishes:
156+
157+
.. code-block::
158+
:caption: Terminal 2 (GDB client)
159+
160+
(gdb) target extended-remote :1234
161+
Remote debugging using :1234
162+
z_arm_reset () at zephyr/arch/arm/core/cortex_m/reset.S:124
163+
124 movs.n r0, #_EXC_IRQ_DEFAULT_PRIO
164+
(gdb) break llext_load
165+
Breakpoint 1 at 0x236c: file zephyr/subsys/llext/llext.c, line 168.
166+
(gdb) continue
167+
Continuing.
168+
169+
Breakpoint 1, llext_load (ldr=ldr@entry=0x2000bef0 <ztest_thread_stack+3488>,
170+
name=name@entry=0x9d98 "test_detached",
171+
ext=ext@entry=0x2000abb8 <detached_llext>,
172+
ldr_parm=ldr_parm@entry=0x2000bee8 <ztest_thread_stack+3480>)
173+
at zephyr/subsys/llext/llext.c:168
174+
168 *ext = llext_by_name(name);
175+
(gdb) finish
176+
Run till exit from #0 llext_load ([...])
177+
at zephyr/subsys/llext/llext.c:168
178+
llext_test_detached () at zephyr/tests/subsys/llext/src/test_llext.c:481
179+
481 zassert_ok(res, "load should succeed");
180+
181+
The first terminal will have printed lots of debugging information related to
182+
the extension loading. Find the section with the addresses:
183+
184+
.. code-block::
185+
:caption: Terminal 1 (build, QEMU emulator, GDB server)
186+
187+
[...]
188+
D: Allocate and copy regions...
189+
[...]
190+
D: gdb add-symbol-file flags:
191+
D: -s .text 0x20000034
192+
D: -s .data 0x200000b4
193+
D: -s .bss 0x2000c2e0
194+
D: -s .rodata 0x200000b8
195+
D: -s .detach 0x200001d0
196+
D: Counting exported symbols...
197+
[...]
198+
199+
Use these addresses to load the symbols into GDB:
200+
201+
.. code-block::
202+
:caption: Terminal 2 (GDB client)
203+
204+
(gdb) add-symbol-file build/llext/detached_fn_ext_debug.elf -s .text 0x20000034 -s .data 0x200000b4 -s .bss 0x2000c2e0 -s .rodata 0x200000b8 -s .detach 0x200001d0
205+
add symbol table from file "build/llext/detached_fn_ext_debug.elf" at
206+
.text_addr = 0x20000034
207+
.data_addr = 0x200000b4
208+
.bss_addr = 0x2000c2e0
209+
.rodata_addr = 0x200000b8
210+
.detach_addr = 0x200001d0
211+
(y or n) y
212+
Reading symbols from build/llext/detached_fn_ext_debug.elf...
213+
(gdb) break detached_entry
214+
Breakpoint 2 at 0x200001d0 (2 locations)
215+
(gdb) continue
216+
Continuing.
217+
218+
Breakpoint 2, 0x200001d0 in test_detached_ext ()
219+
(gdb) backtrace
220+
#0 0x200001d0 in test_detached_ext ()
221+
#1 0x200000ac in test_detached_ext ()
222+
#2 0x00000706 in llext_test_detached () at zephyr/tests/subsys/llext/src/test_llext.c:496
223+
#3 0x00001a36 in run_test_functions (suite=0x92bc <z_ztest_test_node_llext>, data=0x0 <cbvprintf_package>, test=0x92d8 <z_ztest_unit_test.llext.test_detached>) at zephyr/subsys/testsuite/ztest/src/ztest.c:328
224+
#4 test_cb (a=0x92bc <z_ztest_test_node_llext>, b=0x92d8 <z_ztest_unit_test.llext.test_detached>, c=0x0 <cbvprintf_package>) at zephyr/subsys/testsuite/ztest/src/ztest.c:662
225+
#5 0x00000e96 in z_thread_entry (entry=0x1a05 <test_cb>, p1=0x92bc <z_ztest_test_node_llext>, p2=0x92d8 <z_ztest_unit_test.llext.test_detached>, p3=0x0 <cbvprintf_package>) at zephyr/lib/os/thread_entry.c:48
226+
#6 0x00000000 in ?? ()
227+
228+
The symbol associated with the breakpoint location and the innermost stack frames
229+
mistakenly reference the ELF buffer in the Zephyr application instead of the
230+
extension symbols. Note, however, that GDB knows both symbols:
231+
232+
.. code-block::
233+
:caption: Terminal 2 (GDB client)
234+
235+
(gdb) info sym 0x200001d0
236+
test_detached_ext + 464 in section datas of zephyr/build/zephyr/zephyr.elf
237+
detached_entry in section .detach of zephyr/build/llext/detached_fn_ext_debug.elf
238+
(gdb) info sym 0x200000ac
239+
test_detached_ext + 172 in section datas of zephyr/build/zephyr/zephyr.elf
240+
test_entry + 8 in section .text of zephyr/build/llext/detached_fn_ext_debug.elf
241+
242+
It is also impossible to inspect the variables in the extension or step through
243+
code properly:
244+
245+
.. code-block::
246+
:caption: Terminal 2 (GDB client)
247+
248+
(gdb) print bss_cnt
249+
No symbol "bss_cnt" in current context.
250+
(gdb) print data_cnt
251+
No symbol "data_cnt" in current context.
252+
(gdb) next
253+
Single stepping until exit from function test_detached_ext,
254+
which has no line number information.
255+
256+
Breakpoint 2, 0x200001ea in test_detached_ext ()
257+
(gdb)
258+
259+
Discarding symbols
260+
------------------
261+
262+
Discarding the Zephyr symbols and only focusing on the extension restores full
263+
debugging functionality at the cost of losing the global context (note the
264+
backtrace stops outside the extension):
265+
266+
.. code-block::
267+
:caption: Terminal 2 (GDB client)
268+
269+
(gdb) symbol-file
270+
Discard symbol table from `zephyr/build/zephyr/zephyr.elf'? (y or n) y
271+
Error in re-setting breakpoint 1: No symbol table is loaded. Use the "file" command.
272+
No symbol file now.
273+
(gdb) add-symbol-file build/llext/detached_fn_ext_debug.elf -s .text 0x20000034 -s .data 0x200000b4 -s .bss 0x2000c2e0 -s .rodata 0x200000b8 -s .detach 0x200001d0
274+
add symbol table from file "build/llext/detached_fn_ext_debug.elf" at
275+
.text_addr = 0x20000034
276+
.data_addr = 0x200000b4
277+
.bss_addr = 0x2000c2e0
278+
.rodata_addr = 0x200000b8
279+
.detach_addr = 0x200001d0
280+
(y or n) y
281+
Reading symbols from build/llext/detached_fn_ext_debug.elf...
282+
(gdb) backtrace
283+
#0 detached_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:18
284+
#1 0x200000ac in test_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:26
285+
#2 0x00000706 in ?? ()
286+
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
287+
(gdb) next
288+
19 zassert_true(data_cnt < 0);
289+
(gdb) print bss_cnt
290+
$1 = 1
291+
(gdb) print data_cnt
292+
$2 = -2
293+
(gdb)
294+
295+
296+
Editing the ELF file
297+
--------------------
298+
299+
In this alternative approach, the patches to the Zephyr ELF file must be
300+
performed after building the Zephyr binary and starting the emulator on
301+
Terminal 1, but before starting the GDB client on Terminal 2.
302+
303+
The above debugging session already identified ``test_detached_ext``, the char
304+
array that holds the ELF file, as an offending symbol, so that will be removed
305+
in a first pass. Performing the same steps multiple times, ``__data_start`` and
306+
``__data_region_start`` can also be found to overlap the memory area of
307+
interest.
308+
309+
The following commands will remove all of these from the Zephyr ELF file, then
310+
start a debugging session on the modified file:
311+
312+
.. code-block::
313+
:caption: Terminal 2 (GDB client)
314+
315+
zephyr$ export ZEPHYR_SDK_INSTALL_DIR=/opt/zephyr-sdk-0.17.0
316+
zephyr$ ${ZEPHYR_SDK_INSTALL_DIR}/arm-zephyr-eabi/bin/arm-zephyr-eabi-objcopy -N test_detached_ext -N __data_start -N __data_region_start build/zephyr/zephyr.elf build/zephyr/zephyr-edit.elf
317+
zephyr$ ${ZEPHYR_SDK_INSTALL_DIR}/arm-zephyr-eabi/bin/arm-zephyr-eabi-gdb build/zephyr/zephyr-edit.elf
318+
GNU gdb (Zephyr SDK 0.17.0) 12.1
319+
[...]
320+
Reading symbols from build/zephyr/zephyr-edit.elf...
321+
(gdb)
322+
323+
The same steps used in the previous run can be performed again to attach to the
324+
GDB server and load both the extension and its debug symbols. This time, however,
325+
the result is rather different:
326+
327+
* the ``break`` command includes line number information;
328+
329+
* the output from ``backtrace`` contains functions from both the extension and
330+
the Zephyr application;
331+
332+
* the local variables can be properly inspected.
333+
334+
.. code-block::
335+
:caption: Terminal 2 (GDB client)
336+
337+
(gdb) add-symbol-file build/llext/detached_fn_ext_debug.elf [...]
338+
[...]
339+
Reading symbols from build/llext/detached_fn_ext_debug.elf...
340+
(gdb) break detached_entry
341+
Breakpoint 2 at 0x200001d6: file zephyr/tests/subsys/llext/src/detached_fn_ext.c, line 17.
342+
(gdb) continue
343+
Continuing.
344+
345+
Breakpoint 2, detached_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:17
346+
17 printk("bss %u @ %p\n", bss_cnt++, &bss_cnt);
347+
(gdb) backtrace
348+
#0 detached_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:17
349+
#1 0x200000ac in test_entry () at zephyr/tests/subsys/llext/src/detached_fn_ext.c:26
350+
#2 0x00000706 in llext_test_detached () at zephyr/tests/subsys/llext/src/test_llext.c:496
351+
#3 0x00001a36 in run_test_functions (suite=0x92bc <z_ztest_test_node_llext>, data=0x0 <cbvprintf_package>, test=0x92d8 <z_ztest_unit_test.llext.test_detached>) at zephyr/subsys/testsuite/ztest/src/ztest.c:328
352+
#4 test_cb (a=0x92bc <z_ztest_test_node_llext>, b=0x92d8 <z_ztest_unit_test.llext.test_detached>, c=0x0 <cbvprintf_package>) at zephyr/subsys/testsuite/ztest/src/ztest.c:662
353+
#5 0x00000e96 in z_thread_entry (entry=0x1a05 <test_cb>, p1=0x92bc <z_ztest_test_node_llext>, p2=0x92d8 <z_ztest_unit_test.llext.test_detached>, p3=0x0 <cbvprintf_package>) at zephyr/lib/os/thread_entry.c:48
354+
#6 0x00000000 in ?? ()
355+
(gdb) print bss_cnt
356+
$1 = 0
357+
(gdb) print data_cnt
358+
$2 = -3
359+
(gdb)

doc/services/llext/index.rst

+1
@@ -16,6 +16,7 @@ and introspected to some degree, as well as unloaded when no longer needed.
16 16
config
17 17
build
18 18
load
19+
debug
19 20
api
20 21

21 22
.. note::

doc/services/llext/load.rst

+1 -7
@@ -94,13 +94,7 @@ If any of this happens, the following tips may help understand the issue:
94 94
the issue.
95 95

96 96
* Use a debugger to inspect the memory and registers to try to understand what
97-
is happening.
98-
99-
.. note::
100-
When using GDB, the ``add_symbol_file`` command may be used to load the
101-
debugging information and symbols from the ELF file. Make sure to specify
102-
the proper offset (usually the start of the ``.text`` section, reported
103-
as ``region 0`` in the debug logs.)
97+
is happening. See :ref:`Debugging extensions <llext_debug>` for more details.
104 98

105 99
If the issue persists, please open an issue in the GitHub repository, including
106 100
all the above information.
