Commit acaf534

Merge pull request #2 from RalfJung/uninitialized-uninhabited
Update unintiialized RFC
2 parents 835f860 + 8ae636b commit acaf534

1 file changed: +60 -26 lines changed

text/0000-uninitialized-uninhabited.md

@@ -6,15 +6,17 @@
66
# Summary
77
[summary]: #summary
88

9-
Deprecate `mem::uninitialized::<T>` and replace it with a `MaybeUninit<T>` type
10-
for safer and more principled handling of uninitialized data.
9+
Deprecate `mem::uninitialized::<T>` and `mem::zeroed::<T>` and replace them with
10+
a `MaybeUninit<T>` type for safer and more principled handling of uninitialized
11+
data.
1112

1213
# Motivation
1314
[motivation]: #motivation
1415

1516
The problems with `uninitialized` centre around its usage with uninhabited
16-
types. The concept of "uninitialized data" is extremely problematic when it
17-
comes into contact with types like `!` or `Void`.
17+
types, and its interaction with Rust's type layout invariants. The concept of
18+
"uninitialized data" is extremely problematic when it comes into contact with
19+
types like `!` or `Void`.
1820

1921
For any given type, there may be valid and invalid bit-representations. For
2022
example, the type `u8` consists of a single byte and all possible bytes can be
@@ -53,6 +55,18 @@ fn mem::uninitialized::<!>() -> !
5355
Yet calling this function does not diverge! It just breaks everything then eats
5456
your laundry instead.
5557

58+
This problem is most prominent with `!` but also applies to other types that
59+
have restrictions on the values they can carry. For example,
60+
`Some(mem::uninitialized::<bool>()).is_none()` could actually return `true`
61+
because uninitialized memory could violate the invariant that a `bool` is always
62+
`[00000000]` or `[00000001]` -- and Rust relies on this invariant when doing
63+
enum layout. So, `mem::uninitialized::<bool>()` is instantaneous undefined
64+
behavior just like `mem::uninitialized::<!>()`. This also affects `mem::zeroed`
65+
when considering types where the all-`0` bit pattern is not valid, like
66+
references: `mem::zeroed::<&'static i32>()` is instantaneous undefined behavior.
67+
68+
## Tracking uninitializedness in the type
69+
5670
An alternative way of representing uninitialized data is through a union type:
5771

5872
```rust
@@ -63,14 +77,16 @@ union MaybeUninit<T> {
6377
```
6478

6579
Instead of creating an "uninitialized value", we can create a `MaybeUninit`
66-
initialized with `uninit = ()`. Then, once we know that the value in the union
80+
initialized with `uninit: ()`. Then, once we know that the value in the union
6781
is valid, we can extract it with `my_uninit.value`. This is a better way of
6882
handling uninitialized data because it doesn't involve lying to the type system
6983
and pretending that we have a value when we don't. It also better represents
7084
what's actually going on: we never *really* have a value of type `T` when we're
7185
using `uninitialized::<T>`, what we have is some memory that contains either a
7286
value (`value: T`) or nothing (`uninit: ()`), with it being the programmer's
73-
responsibility to keep track of which state we're in.
87+
responsibility to keep track of which state we're in. Notice that creating a
88+
`MaybeUninit<T>` is safe for any `T`! Only when accessing `my_uninit.value`
89+
do we have to be careful to ensure that it has been properly initialized.
7490

7591
To see how this can replace `uninitialized` and fix bugs in the process,
7692
consider the following code:
@@ -143,72 +159,90 @@ library as a replacement.
143159
Add the aforementioned `MaybeUninit` type to the standard library:
144160

145161
```rust
146-
#[repr(transparent)]
147-
union MaybeUninit<T> {
162+
pub union MaybeUninit<T> {
148163
uninit: (),
149-
value: T,
164+
value: ManuallyDrop<T>,
150165
}
151166
```
152167

153168
The type should have at least the following interface
169+
([Playground link](https://play.rust-lang.org/?gist=81f5ab9a7e7107c9583de21382ef4333&version=nightly&mode=debug&edition=2015)):
154170

155171
```rust
156172
impl<T> MaybeUninit<T> {
157173
/// Create a new `MaybeUninit` in an uninitialized state.
174+
///
175+
/// Note that dropping a `MaybeUninit` will never call `T`'s drop code.
176+
/// It is your responsibility to make sure `T` gets dropped if it got initialized.
158177
pub fn uninitialized() -> MaybeUninit<T> {
159178
MaybeUninit {
160179
uninit: (),
161180
}
162181
}
163182

183+
/// Create a new `MaybeUninit` in an uninitialized state, with the memory being
184+
/// filled with `0` bytes. It depends on `T` whether that already makes for
185+
/// proper initialization. For example, `MaybeUninit<usize>::zeroed()` is initialized,
186+
/// but `MaybeUninit<&'static i32>::zeroed()` is not because references must not
187+
/// be null.
188+
///
189+
/// Note that dropping a `MaybeUninit` will never call `T`'s drop code.
190+
/// It is your responsibility to make sure `T` gets dropped if it got initialized.
191+
pub fn zeroed() -> MaybeUninit<T> {
192+
let mut u = MaybeUninit::<T>::uninitialized();
193+
unsafe { u.as_mut_ptr().write_bytes(0u8, 1); }
194+
u
195+
}
196+
164197
/// Set the value of the `MaybeUninit`. This overwrites any previous value without dropping it.
165-
pub fn set(&mut self, val: T) -> &mut T {
198+
pub fn set(&mut self, val: T) {
166199
unsafe {
167-
self.value = val;
168-
&mut self.value
200+
self.value = ManuallyDrop::new(val);
169201
}
170202
}
171203

172-
/// Take the value of the `MaybeUninit`, putting it into an uninitialized state.
204+
/// Extract the value from the `MaybeUninit` container. This is a great way
205+
/// to ensure that the data will get dropped, because the resulting `T` is
206+
/// subject to the usual drop handling.
173207
///
174208
/// # Unsafety
175209
///
176210
/// It is up to the caller to guarantee that the `MaybeUninit` really is in an initialized
177-
/// state, otherwise undefined behaviour will result.
178-
pub unsafe fn get(&self) -> T {
179-
std::ptr::read(&self.value)
211+
/// state, otherwise this will immediately cause undefined behavior.
212+
pub unsafe fn into_inner(self) -> T {
213+
std::ptr::read(&*self.value)
180214
}
181215

182216
/// Get a reference to the contained value.
183217
///
184218
/// # Unsafety
185219
///
186220
/// It is up to the caller to guarantee that the `MaybeUninit` really is in an initialized
187-
/// state, otherwise undefined behaviour will result.
221+
/// state, otherwise this will immediately cause undefined behavior.
188222
pub unsafe fn get_ref(&self) -> &T {
189-
&self.value
223+
&*self.value
190224
}
191225

192226
/// Get a mutable reference to the contained value.
193227
///
194228
/// # Unsafety
195229
///
196230
/// It is up to the caller to guarantee that the `MaybeUninit` really is in an initialized
197-
/// state, otherwise undefined behaviour will result.
231+
/// state, otherwise this will immediately cause undefined behavior.
198232
pub unsafe fn get_mut(&mut self) -> &mut T {
199-
&mut self.value
233+
&mut *self.value
200234
}
201235

202-
/// Get a pointer to the contained value. This pointer will only be valid if the `MaybeUninit`
203-
/// is in an initialized state.
236+
/// Get a pointer to the contained value. Reading from this pointer will be undefined
237+
/// behavior unless the `MaybeUninit` is initialized.
204238
pub fn as_ptr(&self) -> *const T {
205-
self as *const MaybeUninit<T> as *const T
239+
unsafe { &*self.value as *const T }
206240
}
207241

208-
/// Get a mutable pointer to the contained value. This pointer will only be valid if the
209-
/// `MaybeUninit` is in an initialized state.
242+
/// Get a mutable pointer to the contained value. Reading from this pointer will be undefined
243+
/// behavior unless the `MaybeUninit` is initialized.
210244
pub fn as_mut_ptr(&mut self) -> *mut T {
211-
self as *mut MaybeUninit<T> as *mut T
245+
unsafe { &mut *self.value as *mut T }
212246
}
213247
}
214248
```
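
The interface in this diff can be exercised roughly as follows. This is only a sketch, and it assumes the `std::mem::MaybeUninit` type that later shipped in the standard library rather than the draft above: the stabilized method names differ from this RFC text (`uninit`, `write` and `assume_init` instead of `uninitialized`, `set` and `into_inner`). The helper function names are hypothetical and exist only for illustration.

```rust
use std::mem::MaybeUninit;

/// Initialize a slot through a raw pointer, then extract the value.
/// Assumption: uses the stabilized std API, not the draft API in the diff.
fn init_via_pointer() -> u32 {
    // Creating the uninitialized slot is safe for any `T`.
    let mut slot = MaybeUninit::<u32>::uninit();
    unsafe {
        // Writing through the pointer never reads the old (uninitialized) bytes.
        slot.as_mut_ptr().write(42);
        // Sound only because the write above fully initialized the slot.
        slot.assume_init()
    }
}

/// Initialize a slot with the safe setter (`set` in the draft, `write` in std).
fn init_via_write() -> String {
    let mut slot = MaybeUninit::<String>::uninit();
    // Overwrites the slot without dropping any previous contents.
    slot.write(String::from("hello"));
    // The extracted `String` is subject to the usual drop handling again.
    unsafe { slot.assume_init() }
}

/// The all-zero bit pattern is a valid `usize`, so a zeroed slot counts
/// as initialized. The same would NOT hold for `&'static i32`.
fn zeroed_usize() -> usize {
    let slot = MaybeUninit::<usize>::zeroed();
    unsafe { slot.assume_init() }
}
```

In each function the only `unsafe` step is the one that asserts the slot is initialized, which mirrors the split described in the RFC text: constructing the container is safe, and the programmer takes responsibility only at the point of extraction.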
