
Commit 580f301

Merge pull request rust-lang#23 from theotherphil/hir
Copy contents of README.md from librustc/hir
2 parents 67da39e + 11bb542 commit 580f301


1 file changed

+118
-0
lines changed

src/hir-lowering.md

+118
@@ -1 +1,119 @@
# HIR lowering

The HIR -- "High-level IR" -- is the primary IR used in most of
rustc. It is a desugared version of the "abstract syntax tree" (AST)
that is generated after parsing, macro expansion, and name resolution
have completed. Many parts of HIR resemble Rust surface syntax quite
closely, with the exception that some of Rust's expression forms have
been desugared away (as an example, `for` loops are converted into a
`loop` and do not appear in the HIR).

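To make the `for` desugaring concrete, here is a hand-written sketch of
the two forms side by side. It is only an approximation written in
surface Rust, not the exact HIR that rustc produces; the function names
`sum_with_for` and `sum_with_loop` are made up for this illustration.

```rust
/// Sums a slice using an ordinary `for` loop.
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += v;
    }
    total
}

/// Roughly what the `for` loop above turns into: a plain `loop`
/// that drives the iterator by hand and breaks on `None`.
fn sum_with_loop(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(values);
    loop {
        match iter.next() {
            Some(v) => total += v,
            None => break,
        }
    }
    total
}
```

Both functions compute the same result, so later compiler passes only
ever need to understand `loop`, `match`, and `break`.
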
This chapter covers the main concepts of the HIR.

### Out-of-band storage and the `Crate` type

The top-level data-structure in the HIR is the `Crate`, which stores
the contents of the crate currently being compiled (we only ever
construct HIR for the current crate). Whereas in the AST the crate
data structure basically just contains the root module, the HIR
`Crate` structure contains a number of maps and other things that
serve to organize the content of the crate for easier access.

For example, the contents of individual items (e.g., modules,
functions, traits, impls, etc.) in the HIR are not immediately
accessible in the parents. So, for example, if you had a module item `foo`
containing a function `bar()`:

```
mod foo {
    fn bar() { }
}
```

Then in the HIR the representation of module `foo` (the `Mod`
struct) would have only the **`ItemId`** `I` of `bar()`. To get the
details of the function `bar()`, we would look up `I` in the
`items` map.

One nice result from this representation is that one can iterate
over all items in the crate by iterating over the key-value pairs
in these maps (without the need to trawl through the whole IR).
There are similar maps for things like trait items and impl items,
as well as "bodies" (explained below).

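The layout described above can be sketched with a small toy model. The
types below (`ItemId`, `Mod`, `Item`, `Crate`) borrow their names from
the text, but their fields and the helpers `example_crate` and
`all_item_names` are assumptions made for illustration; they are not
rustc's actual definitions.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for an item id; not rustc's real `ItemId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct ItemId(u32);

/// A module stores only the ids of its children, not their contents.
#[derive(Debug)]
struct Mod {
    item_ids: Vec<ItemId>,
}

#[derive(Debug)]
enum ItemKind {
    Mod(Mod),
    Fn,
}

#[derive(Debug)]
struct Item {
    name: &'static str,
    kind: ItemKind,
}

/// The crate owns every item in one flat map, keyed by id.
struct Crate {
    items: BTreeMap<ItemId, Item>,
}

impl Crate {
    /// Look up an item's contents from its id.
    fn item(&self, id: ItemId) -> &Item {
        &self.items[&id]
    }
}

/// Builds the `mod foo { fn bar() { } }` example from the text.
fn example_crate() -> Crate {
    let mut items = BTreeMap::new();
    let foo = Mod { item_ids: vec![ItemId(1)] };
    items.insert(ItemId(0), Item { name: "foo", kind: ItemKind::Mod(foo) });
    items.insert(ItemId(1), Item { name: "bar", kind: ItemKind::Fn });
    Crate { items }
}

/// Visits every item by walking the flat map instead of the tree.
fn all_item_names(krate: &Crate) -> Vec<&'static str> {
    krate.items.values().map(|item| item.name).collect()
}
```

In this model, reaching `bar` from `foo` always goes through
`Crate::item`, and `all_item_names` never has to recurse into modules.
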
The other reason to set up the representation this way is for better
integration with incremental compilation. This way, if you gain access
to a `&hir::Item` (e.g. for the mod `foo`), you do not immediately
gain access to the contents of the function `bar()`. Instead, you only
gain access to the **id** for `bar()`, and you must invoke some
function to look up the contents of `bar()` given its id; this gives us
a chance to observe that you accessed the data for `bar()` and record
the dependency.

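A minimal sketch of that idea, assuming a hypothetical `TrackedCrate`
wrapper whose `reads` log stands in for the real dependency graph
(rustc's actual tracking machinery is considerably more involved):

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Toy stand-in for an item id; not rustc's real `ItemId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ItemId(u32);

struct Item {
    name: &'static str,
}

/// Hypothetical crate wrapper that logs every item lookup.
struct TrackedCrate {
    items: HashMap<ItemId, Item>,
    /// Ids whose contents were read; a stand-in for a dependency graph.
    reads: RefCell<Vec<ItemId>>,
}

impl TrackedCrate {
    fn new(items: HashMap<ItemId, Item>) -> Self {
        TrackedCrate { items, reads: RefCell::new(Vec::new()) }
    }

    /// The only route from an id to its contents, so every access is observed.
    fn item(&self, id: ItemId) -> &Item {
        self.reads.borrow_mut().push(id);
        &self.items[&id]
    }

    /// Returns the ids that have been looked up so far, in order.
    fn recorded_reads(&self) -> Vec<ItemId> {
        self.reads.borrow().clone()
    }
}

/// Builds a crate holding `foo` (id 0) and `bar` (id 1).
fn example() -> TrackedCrate {
    let mut items = HashMap::new();
    items.insert(ItemId(0), Item { name: "foo" });
    items.insert(ItemId(1), Item { name: "bar" });
    TrackedCrate::new(items)
}
```

Holding an `ItemId` records nothing; only the call to `item` does, which
is the point at which a dependency on `bar()` would be noted.
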
### Identifiers in the HIR

Most of the code that has to deal with things in HIR tends not to
carry around references into the HIR, but rather to carry around
*identifier numbers* (or just "ids"). Right now, you will find four
sorts of identifiers in active use:

- `DefId`, which primarily names "definitions" or top-level items.
  - You can think of a `DefId` as being shorthand for a very explicit
    and complete path, like `std::collections::HashMap`. However,
    these paths are able to name things that are not nameable in
    normal Rust (e.g., impls), and they also include extra information
    about the crate (such as its version number, as two versions of
    the same crate can co-exist).
  - A `DefId` really consists of two parts, a `CrateNum` (which
    identifies the crate) and a `DefIndex` (which indexes into a list
    of items that is maintained per crate).
- `HirId`, which combines the index of a particular item with an
  offset within that item.
  - The key point of a `HirId` is that it is *relative* to some item (which is named
    via a `DefId`).
- `BodyId`, which is an absolute identifier that refers to a specific
  body (definition of a function or constant) in the crate. It is currently
  effectively a "newtype'd" `NodeId`.
- `NodeId`, which is an absolute id that identifies a single node in the HIR tree.
  - While these are still in common use, **they are being slowly phased out**.
  - Since they are absolute within the crate, adding a new node
    anywhere in the tree causes the node-ids of all subsequent code in
    the crate to change. This is terrible for incremental compilation,
    as you can perhaps imagine.

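The difference between absolute and item-relative ids can be seen in a
toy numbering scheme. The struct layouts and the `assign_ids` helper
below are assumptions for illustration only; they follow the
descriptions in the list above rather than rustc's real definitions.

```rust
/// Toy mirrors of the id types described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct CrateNum(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefIndex(u32);

/// A crate plus an index into that crate's list of items.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefId {
    krate: CrateNum,
    index: DefIndex,
}

/// An owning item plus an offset within that item.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct HirId {
    owner: DefIndex,
    local_id: u32,
}

/// An absolute, crate-wide node number.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct NodeId(u32);

/// Numbers every node of every item both ways. `node_counts[i]` is the
/// number of nodes in item `i`; the result has one list per item.
fn assign_ids(node_counts: &[u32]) -> Vec<Vec<(NodeId, HirId)>> {
    let mut next_node = 0;
    let mut out = Vec::new();
    for (item, &count) in node_counts.iter().enumerate() {
        let mut ids = Vec::new();
        for offset in 0..count {
            let hir_id = HirId { owner: DefIndex(item as u32), local_id: offset };
            ids.push((NodeId(next_node), hir_id));
            next_node += 1;
        }
        out.push(ids);
    }
    out
}
```

Comparing `assign_ids(&[2, 1])` with `assign_ids(&[3, 1])` shows the
problem: growing the first item by one node shifts the `NodeId` of the
second item's first node, while its `HirId` stays the same.
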
### HIR Map

Most of the time when you are working with the HIR, you will do so via
the **HIR Map**, accessible in the tcx via `tcx.hir` (and defined in
the `hir::map` module). The HIR map contains a number of methods to
convert between ids of various kinds and to look up data associated
with a HIR node.

For example, if you have a `DefId`, and you would like to convert it
to a `NodeId`, you can use `tcx.hir.as_local_node_id(def_id)`. This
returns an `Option<NodeId>` -- this will be `None` if the def-id
refers to something outside of the current crate (since then it has no
HIR node), but otherwise returns `Some(n)` where `n` is the node-id of
the definition.

Similarly, you can use `tcx.hir.find(n)` to look up the node for a
`NodeId`. This returns an `Option<Node<'tcx>>`, where `Node` is an enum
defined in the map; by matching on this you can find out what sort of
node the node-id referred to and also get a pointer to the data
itself. Often, you know what sort of node `n` is -- e.g., if you know
that `n` must be some HIR expression, you can do
`tcx.hir.expect_expr(n)`, which will extract and return the
`&hir::Expr`, panicking if `n` is not in fact an expression.

Finally, you can use the HIR map to find the parents of nodes, via
calls like `tcx.hir.get_parent_node(n)`.

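The shape of these lookups can be modelled outside the compiler. The
`HirMap` struct below is a toy: it reuses the method names mentioned
above, but its fields, the simplified `DefId`, the `LOCAL_CRATE`
constant, the root-is-its-own-parent rule, and the `describe` helper are
all assumptions for illustration, not rustc's implementation.

```rust
use std::collections::HashMap;

/// Crate number assumed to mean "the crate being compiled" in this toy.
const LOCAL_CRATE: u32 = 0;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefId {
    krate: u32,
    index: u32,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct NodeId(u32);

#[derive(Debug, PartialEq)]
struct Item {
    name: &'static str,
}

#[derive(Debug, PartialEq)]
struct Expr {
    text: &'static str,
}

/// What a node id can refer to; matching on it reveals the node's kind.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Node<'hir> {
    Item(&'hir Item),
    Expr(&'hir Expr),
}

struct HirMap<'hir> {
    def_to_node: HashMap<DefId, NodeId>,
    nodes: HashMap<NodeId, Node<'hir>>,
    parents: HashMap<NodeId, NodeId>,
}

impl<'hir> HirMap<'hir> {
    /// `None` for definitions from other crates, which have no HIR node.
    fn as_local_node_id(&self, def_id: DefId) -> Option<NodeId> {
        if def_id.krate != LOCAL_CRATE {
            return None;
        }
        self.def_to_node.get(&def_id).copied()
    }

    /// Looks up the node for an id, if there is one.
    fn find(&self, id: NodeId) -> Option<Node<'hir>> {
        self.nodes.get(&id).copied()
    }

    /// Returns the expression for `id`, panicking if it is anything else.
    fn expect_expr(&self, id: NodeId) -> &'hir Expr {
        match self.find(id) {
            Some(Node::Expr(expr)) => expr,
            other => panic!("expected an expression, found {:?}", other),
        }
    }

    /// Returns the parent node; in this toy, a root is its own parent.
    fn get_parent_node(&self, id: NodeId) -> NodeId {
        self.parents.get(&id).copied().unwrap_or(id)
    }
}

/// Matches on the result of `find` to report what kind of node `id` names.
fn describe(map: &HirMap<'_>, id: NodeId) -> String {
    match map.find(id) {
        Some(Node::Item(item)) => format!("item `{}`", item.name),
        Some(Node::Expr(expr)) => format!("expr `{}`", expr.text),
        None => "no such node".to_string(),
    }
}
```

The `Option` return types force callers to handle the "not in this
crate" and "no such node" cases, while `expect_expr` trades that safety
for convenience when the node kind is already known.
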
### HIR Bodies

A **body** represents some kind of executable code, such as the body
of a function/closure or the definition of a constant. Bodies are
associated with an **owner**, which is typically some kind of item
(e.g., a `fn()` or `const`), but could also be a closure expression
(e.g., `|x, y| x + y`). You can use the HIR map to find the body
associated with a given def-id (`maybe_body_owned_by()`) or to find
the owner of a body (`body_owner_def_id()`).
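
The owner/body relationship can be sketched as a pair of lookup tables.
The `Bodies` struct and its `insert` method are hypothetical, and the
two query methods only borrow their names from the HIR map; their
signatures here are simplified assumptions, not rustc's own.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct NodeId(u32);

/// A body id modelled as a newtype around a `NodeId`, as described earlier.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct BodyId {
    node_id: NodeId,
}

/// Toy two-way mapping between owners and their bodies.
struct Bodies {
    owner_to_body: HashMap<DefId, BodyId>,
    body_to_owner: HashMap<BodyId, DefId>,
}

impl Bodies {
    fn new() -> Self {
        Bodies { owner_to_body: HashMap::new(), body_to_owner: HashMap::new() }
    }

    /// Registers `body` as the body owned by `owner`.
    fn insert(&mut self, owner: DefId, body: BodyId) {
        self.owner_to_body.insert(owner, body);
        self.body_to_owner.insert(body, owner);
    }

    /// `None` when the definition has no body (e.g. a struct).
    fn maybe_body_owned_by(&self, owner: DefId) -> Option<BodyId> {
        self.owner_to_body.get(&owner).copied()
    }

    /// Every registered body has exactly one owner; panics on an unknown body.
    fn body_owner_def_id(&self, body: BodyId) -> DefId {
        self.body_to_owner[&body]
    }
}
```

The asymmetry in return types reflects the text: not every definition
owns a body, but every body has an owner.
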
