Commit 5b0253d

Implement a disassembler service.
It is inspired by Godbolt, with the difference that it can be used on actual projects (i.e. across many source files) and not only on excerpts of code copy-pasted into the Godbolt editor. The disassembler service therefore does not operate on snippets of example code but on the actual binaries your code produces.

The service tries to match the symbol you are interested in at the source-code level, e.g. some function, against the symbols present in the resulting binary. It does so by disassembling the binary and jumping to the symbol of interest in the resulting disassembly.

Sometimes the symbol of interest is not present in the resulting binary at all because it was optimized out, inlined, or merged with another symbol. This is a normal consequence of compiler and/or linker optimizations and cannot be avoided. It can also happen that the symbol of interest is not an actual symbol but a meta-symbol, such as a function template or a class template, and therefore cannot exist in that form in the binary. In both cases, the disassembly service instead offers a list of similar symbols that we can jump to. For function/class templates this approach works quite well. When the symbol is completely missing, we have to fall back on manual heuristics to find the interesting part of the disassembly.

Because the code is not isolated into a single translation unit, as it is in Godbolt, this service lets us examine the codegen at the project level rather than at the level of a single source file. This is a much tougher challenge for the compiler and linker, and hence it may affect the resulting codegen.

The service also provides the flexibility to pick any target (binary) available in your project (e.g. unit tests, google-benchmarks, etc.). The whole service is implemented natively through a combination of nm, objdump and libclang, and has no dependency whatsoever on Godbolt. It runs completely locally within the cxxd server. It also spares you from copy-pasting excerpts, which may contain sensitive content, into the Godbolt editor, and from cutting loose project-specific dependencies so that the excerpt compiles. The latter is sometimes time-consuming, especially in large codebases.

To make the disassembly easier to understand, the service also provides documentation of x86-64 assembly instructions, which the client application can retrieve (e.g. when hovering the mouse over an assembly instruction). This part is implemented with the help of the script that Godbolt uses to scrape the documentation from https://www.felixcloutier.com/x86/index.html. I have tweaked the script to better fit the purposes of cxxd.

There are some disassembly-related knobs that can be tweaked through the .cxxd_config.json configuration file:

(1) "targets-filter": filters out uninteresting targets from your build directory, e.g. "targets-filter" : [".so", ".a", "CMake"].
(2) "intermix-with-src-code": mixes the disassembly with the source code. Off by default.
(3) "syntax": selects the disassembly syntax. Defaults to Intel syntax ("intel") but can be changed to AT&T.
(4) "visualize-jumps": visualizes the branches in the disassembled code. On by default.
(5) "binary": custom nm and objdump binaries can be provided should there be a reason for it.

E.g.

    "disassembly" : {
        "objdump" : {
            "binary" : "",
            "intermix-with-src-code" : false,
            "visualize-jumps" : true,
            "syntax" : "intel"
        },
        "nm" : {
            "binary" : ""
        },
        "targets-filter" : [".so", ".a", "CMake"]
    }
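The "list of similar symbols" fallback described above can be illustrated with a small sketch. It is not the code from this commit: the helper names `parse_nm_output` and `find_candidates` are hypothetical, the matching rule is a simple heuristic, and the input is assumed to be already-demangled text such as `nm --demangle --defined-only <binary>` would print.

```python
import re

def parse_nm_output(nm_output):
    """Split nm-style lines into (address, type, demangled_name) tuples.

    Lines without an address (e.g. undefined 'U' symbols) are skipped.
    """
    symbols = []
    for line in nm_output.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) != 3:
            continue
        address, sym_type, name = parts
        symbols.append((address, sym_type, name))
    return symbols

def find_candidates(symbols, name):
    """Return code symbols whose demangled name looks like `name`.

    Hypothetical heuristic: `name` must start at a non-identifier boundary
    and be followed by '(' (plain function) or '<' (template instantiation).
    Only text-section (T/t) and weak (W/w) symbols are considered.
    """
    pattern = re.compile(r"(?<!\w)" + re.escape(name) + r"(?=[<(])")
    return [s for s in symbols if s[1] in "TtWw" and pattern.search(s[2])]

# Made-up sample input, shaped like demangled nm output.
SAMPLE_NM_OUTPUT = """\
0000000000401136 T main
0000000000401150 W int add<int>(int, int)
0000000000401170 W double add<double>(double, double)
0000000000401190 T address_of(int)
0000000000404028 B counter
                 U printf
"""

symbols = parse_nm_output(SAMPLE_NM_OUTPUT)
for address, sym_type, name in find_candidates(symbols, "add"):
    print(address, sym_type, name)
```

For a function template such as `add`, the source-level name never appears verbatim in the binary, so a lookup of this kind returns every instantiation and leaves the choice to the user.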
1 parent f0e48ef commit 5b0253d

7 files changed: +6572 -3 lines changed

README.md

Lines changed: 22 additions & 2 deletions
@@ -34,6 +34,7 @@ JSON-compilation-database integration | :heavy_check_mark: | :heavy_check_mark:
 Plain-text-compilation-database integration | :heavy_check_mark: | :heavy_check_mark:
 Arbitrary build targets integration | :heavy_check_mark: | :heavy_check_mark:
 Per-repository cxxd custom configuration (JSON) | :heavy_check_mark: | :heavy_check_mark:
+Godbolt-like disassembler | :heavy_check_mark: | :heavy_check_mark:

 In essence, the main idea behind it is very much alike to what [`LSP`](https://microsoft.github.io/language-server-protocol/) offers
 and its implementations like [`clangd`](https://clang.llvm.org/extra/clangd.html).
@@ -58,8 +59,8 @@ Optional: `clang-format`, `clang-tidy`

 Platform | Install
 ------------ | -------------
-`Fedora` | `sudo dnf install python3 clang-devel clang-libs clang-tools-extra && pip install --user clang`
-`Debian` | `sudo apt-get install python3 libclang-dev clang-tidy clang-format && pip install --user clang`
+`Fedora` | `sudo dnf install python3 clang-devel clang-libs clang-tools-extra && pip install --user clang cxxfilt`
+`Debian` | `sudo apt-get install python3 libclang-dev clang-tidy clang-format && pip install --user clang cxxfilt`

 # Configuration

@@ -87,6 +88,15 @@ Category | Type | Value | Description
 . | `args` | `clang-tidy` specific cmd-line args | Here we can provide a list of any arguments that we want to pass over to `clang-tidy` invocation. For example, to enable bugprone, cppcoreguidelines and readability `clang-tidy` checks we would do `'args' : { '-checks' : "'-*,bugprone-*,cppcoreguidelines-*,readability-*'", "-extra-arg" : "'-Wno-unknown-warning-option'", '-header-filter' : '.*' }`. We can use this list to basically pass any argument that given version of `clang-tidy` can recognize and tweak it according to the project-specific needs.
 `clang` | | | Here we can optionally set some of the `clang`-specific settings. Should be needed very rarely.
 . | `library-file` | path-to-specific-libclang-so-library | When updating your system, sometimes a new version of `libclang` can introduce bugs or changes in behavior which will result with glitches in the usage experience. Same can happen with the python bindings of `libclang`. Because `cxxd` does not have a capacity to be tested against every version of `libclang` and its python bindings, `library-file` serves the purpose to tell `cxxd` to use a certain version of `libclang`. If not provided, `cxxd` will by default use the system-wide one, which in most cases should be enough. However, if you suddenly start to experience the issues, which you have not before, this should be a first thing to check. And possibly revert the `libclang` version to an earlier one. E.g. `'library-file': '/usr/lib64/libclang.so.14.0.5'`.
+`disassembly` | | | Godbolt-like utility which allows you to examine the disassembly in selected executable target for a given symbol you requested it for at the source code level.
+. | `objdump` | |
+. | `binary` | path-to-specific-objdump-binary | Usually system-wide installed `objdump` will match the needs but if not, this field can be used to pin to particular version of `objdump`. E.g. `'binary': '/opt/clang+llvm-8.0.0-x86_64-linux-gnu/bin/objdump'`
+. | `intermix-with-src-code` | true or false | Default is false. If you want to see source-code mixed within the disassembled binary set this field to true.
+. | `visualize-jumps` | true or false | Default is true. If you want to disable ASCII art which visualizes jumps within the disassembled binary set this field to false.
+. | `syntax` | `intel` or `att` | Default is `intel`. Syntax used for instructions in a disassembled binary.
+. | `nm` | |
+. | `binary` | path-to-specific-nm-binary | Usually system-wide installed `nm` will match the needs but if not, this field can be used to pin to particular version of `nm`. E.g. `'binary': '/opt/clang+llvm-8.0.0-x86_64-linux-gnu/bin/nm'`
+. | `targets-filter` | A list of extensions or directories | Prior to disassembly, a target binary needs to be selected. This filter can be used to remove targets that are not interesting. E.g. `[".so", ".a", "CMake"]`

 ## Compilation database

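As a rough illustration of how the `disassembly` options in the table above could translate into an `objdump` invocation, here is a minimal sketch. It is an assumption, not the code added by this commit: the helper names `build_objdump_command` and `filter_targets` are hypothetical, the substring-based filter is a guess at the matching rule, and `--visualize-jumps` requires a reasonably recent GNU binutils.

```python
def build_objdump_command(config, target):
    """Map the 'disassembly.objdump' settings onto GNU objdump flags.

    Defaults follow the table above: intel syntax, no intermixed source,
    jump visualization enabled, system-wide objdump when 'binary' is unset.
    """
    objdump_cfg = config.get("disassembly", {}).get("objdump", {})
    binary = objdump_cfg.get("binary") or "objdump"
    cmd = [binary, "--disassemble", "--demangle", "--no-show-raw-insn"]
    syntax = objdump_cfg.get("syntax", "intel")
    cmd += ["-M", "intel" if syntax == "intel" else "att"]
    if objdump_cfg.get("intermix-with-src-code", False):
        cmd.append("--source")
    if objdump_cfg.get("visualize-jumps", True):
        cmd.append("--visualize-jumps")
    cmd.append(target)
    return cmd

def filter_targets(paths, targets_filter):
    """Drop every path that contains one of the filter strings (assumed rule)."""
    return [p for p in paths if not any(f in p for f in targets_filter)]

# Made-up configuration and build-directory contents for illustration.
config = {
    "disassembly": {
        "objdump": {"binary": "", "intermix-with-src-code": True, "syntax": "att"},
        "targets-filter": [".so", ".a", "CMake"],
    }
}
candidates = ["build/libfoo.so", "build/CMakeFiles/probe.bin", "build/unit_tests"]
targets = filter_targets(candidates, config["disassembly"]["targets-filter"])
print(targets)
print(build_objdump_command(config, targets[0]))
```

The resulting list can be handed to `subprocess.run` as-is, which avoids shell quoting issues with target paths.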
@@ -189,6 +199,16 @@ Some parts of the MySQL used non-standard file extensions for the source code su
 "cmd" : "cmake ../../../5.x -GNinja -DCMAKE_BUILD_TYPE=RelWithDebInfo -DDOWNLOAD_BOOST=ON -DWITH_BOOST=../ -DWITH_UNIT_TESTS=ON -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DCMAKE_CXX_COMPILER_LAUNCHER=ccache && ninja"
 },
 }
+},
+"disassembly" : {
+"objdump" : {
+"intermix-with-src-code" : false,
+"visualize-jumps" : true,
+"syntax" : "intel"
+},
+"nm" : {
+},
+"targets-filter" : [".so", ".a", "CMake"]
 }
 }
 ```

api.py

Lines changed: 21 additions & 0 deletions
@@ -4,6 +4,7 @@
 from . services.code_completion.code_completion import CodeCompletionRequestId
 from . services.source_code_model.indexer.clang_indexer import SourceCodeModelIndexerRequestId
 from . services.project_builder_service import ProjectBuilderRequestId
+from . services.disassembly_service import DisassemblyRequestId

 #
 # Server API
@@ -180,6 +181,26 @@ def clang_tidy_stop(handle, subscribe_for_callback):
 def clang_tidy_request(handle, filename, apply_fixes):
     _server_request_service(handle, ServiceId.CLANG_TIDY, filename, apply_fixes)

+#
+# Disassembly service API
+#
+def disassembly_start(handle):
+    _server_start_service(handle, ServiceId.DISASSEMBLY)
+
+def disassembly_stop(handle, subscribe_for_callback):
+    _server_stop_service(handle, ServiceId.DISASSEMBLY, subscribe_for_callback)
+
+def disassembly_list_targets(handle):
+    _server_request_service(handle, ServiceId.DISASSEMBLY, DisassemblyRequestId.LIST_TARGETS)
+
+def disassembly_list_symbol_candidates(handle, target, filename, line, column):
+    _server_request_service(handle, ServiceId.DISASSEMBLY, DisassemblyRequestId.LIST_SYMBOL_CANDIDATES, target, filename, line, column)
+
+def disassembly_run(handle, target, list_symbol_candidate_index):
+    _server_request_service(handle, ServiceId.DISASSEMBLY, DisassemblyRequestId.DISASSEMBLE, target, list_symbol_candidate_index)
+
+def disassembly_asm_doc(handle, asm_instruction):
+    _server_request_service(handle, ServiceId.DISASSEMBLY, DisassemblyRequestId.ASM_INSTRUCTION_INFO, asm_instruction)

 #
 # Helper functions.
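
A client would presumably call the new API functions in sequence: start the service, ask for symbol candidates at a cursor position, then request the disassembly for one of them. The sketch below only illustrates that ordering. `request_disassembly` and `RecordingApi` are hypothetical helpers, the `api` argument stands in for the `cxxd` API module, and how results are delivered back to the client is not shown in this diff, so the sketch does not wait for any reply.

```python
class RecordingApi:
    """Stand-in for the cxxd api module: records calls instead of contacting a server."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any unknown attribute becomes a function that logs its own invocation.
        def record(*args):
            self.calls.append((name,) + args)
        return record

def request_disassembly(api, handle, target, filename, line, column, candidate_index=0):
    """Issue the requests needed to disassemble the symbol at (line, column).

    A real client would likely wait for the candidate list before choosing
    candidate_index; that step is omitted here.
    """
    api.disassembly_start(handle)
    api.disassembly_list_symbol_candidates(handle, target, filename, line, column)
    api.disassembly_run(handle, target, candidate_index)

# Made-up handle, target and source position for illustration.
api = RecordingApi()
request_disassembly(api, "handle", "build/unit_tests", "src/foo.cpp", 10, 5)
for call in api.calls:
    print(call)
```

Passing the API object in as a parameter keeps the sketch runnable without a cxxd server; in real use the module exposing `disassembly_start` and the other functions from this diff would be passed instead.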
