
Commit

Add implementation-details segment
djmitche committed Oct 1, 2024
1 parent 0a9a01d commit 0ffb3ad
Showing 7 changed files with 75 additions and 51 deletions.
4 changes: 4 additions & 0 deletions src/SUMMARY.md
@@ -162,6 +162,10 @@
  - [Struct Lifetimes](lifetimes/struct-lifetimes.md)
  - [Exercise: Protobuf Parsing](lifetimes/exercise.md)
    - [Solution](lifetimes/solution.md)
+- [Implementation Details](implementation-details.md)
+  - [Niche Optimization](implementation-details/niche-optimization.md)
+  - [Exercise: TBD](implementation-details/exercise.md)
+    - [Solution](implementation-details/solution.md)

---

3 changes: 3 additions & 0 deletions src/implementation-details.md
@@ -0,0 +1,3 @@
# Implementation Details

{{%segment outline}}
5 changes: 5 additions & 0 deletions src/implementation-details/exercise.md
@@ -0,0 +1,5 @@
# Exercise: TBD

NOTES:

- maybe a good place for a linked list?
57 changes: 57 additions & 0 deletions src/implementation-details/niche-optimization.md
@@ -0,0 +1,57 @@
---
minutes: 10
---

# Niche Optimization

For some types, there are in-memory bit patterns that do not represent a valid
value. For example, `bool` can only be 0 or 1, and references are represented as
non-NULL pointers. Rust uses this observation to store enums without a distinct
discriminant field, saving space.

```rust,editable
#![allow(dead_code)]
use std::{mem::size_of_val, slice::from_raw_parts};

enum TriState {
    Set(bool),
    Unset,
}

fn show<T: Sized>(name: &str, value: T) {
    let bytes = unsafe {
        from_raw_parts(&value as *const T as *const u8, size_of_val(&value))
    }
    .iter()
    .map(|b| format!("{:02x}", b))
    .collect::<Vec<_>>()
    .join("");
    println!("{}: {} = {}", name, std::any::type_name::<T>(), bytes);
}

fn main() {
    show("false", TriState::Set(false));
    show("true", TriState::Set(true));
    show("unset", TriState::Unset);
}
```

<details>

The example shows Rust choosing a non-boolean value for the `Unset` variant.

Try showing:

- `&x` for some x
- `Some(&x)`
- `None::<&u32>`
- `Some(Some(&x))`
- `std::num::NonZero::new(10)`

Null pointer optimization: For
[some types](https://doc.rust-lang.org/std/option/#representation), Rust
guarantees that `size_of::<T>()` equals `size_of::<Option<T>>()` and that the
all-zeroes pattern transmutes to `None`. This is a special case of the niche
optimization for `Option`.
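
The guarantee above can be checked with a minimal sketch like the following. The
helper names `option_is_free` and `zeroed_option_ref` are assumptions made up
for this illustration; the types checked are ones listed in the linked
`std::option` representation section.

```rust
use std::mem::{size_of, transmute};
use std::num::NonZeroU32;

/// Returns true when wrapping `T` in `Option` costs no extra space.
/// (Hypothetical helper, not part of the course material.)
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

/// Builds an `Option<&'static u32>` from an all-zeroes bit pattern.
/// (Hypothetical helper, not part of the course material.)
fn zeroed_option_ref() -> Option<&'static u32> {
    let zeroes = [0u8; size_of::<&'static u32>()];
    // SAFETY: for references, the `std::option` documentation guarantees
    // that the all-zeroes pattern is a valid `Option<&T>`, namely `None`.
    unsafe { transmute(zeroes) }
}

fn main() {
    // Types covered by the documented guarantee.
    assert!(option_is_free::<&u32>());
    assert!(option_is_free::<Box<u32>>());
    assert!(option_is_free::<NonZeroU32>());

    // `u32` uses every bit pattern, so there is no niche left for `None`.
    assert!(!option_is_free::<u32>());

    assert!(zeroed_option_ref().is_none());
    println!("all checks passed");
}
```

Only the types named in the documentation carry this guarantee; for other types
the compiler may or may not find a niche.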

</details>
1 change: 1 addition & 0 deletions src/implementation-details/solution.md
@@ -0,0 +1 @@
# Solution
4 changes: 2 additions & 2 deletions src/std-types/option.md
@@ -30,7 +30,7 @@ fn main() {
None.
- It's common to `unwrap`/`expect` all over the place when hacking something
together, but production code typically handles `None` in a nicer fashion.
-- The niche optimization means that `Option<T>` often has the same size in
-  memory as `T`.
+- The [niche optimization](../implementation-details/niche-optimization.md)
+  means that `Option<T>` often has the same size in memory as `T`.

</details>
52 changes: 3 additions & 49 deletions src/user-defined-types/enums.md
@@ -46,9 +46,9 @@ Key Points:
- Rust uses minimal space to store the discriminant.
  - If necessary, it stores an integer of the smallest required size
  - If the allowed variant values do not cover all bit patterns, it will use
-    invalid bit patterns to encode the discriminant (the "niche optimization").
-    For example, `Option<&u8>` stores either a pointer to an integer or `NULL`
-    for the `None` variant.
+    invalid bit patterns to encode the discriminant (the
+    "[niche optimization](../implementation-details/niche-optimization.md)",
+    discussed on day 3).
- You can control the discriminant if needed (e.g., for compatibility with C):

<!-- mdbook-xgettext: skip -->
@@ -70,50 +70,4 @@ Key Points:
Without `repr`, the discriminant type takes 2 bytes, because 10001 fits in 2
bytes.
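
As a rough illustration of that point, here is a sketch with two hypothetical
enums, `Plain` and `Wide` (names invented here; the enum the note refers to
lies outside this hunk). The layout of an enum without `repr` is not
guaranteed, so the size printed for `Plain` is simply whatever the current
compiler chooses.

```rust
use std::mem::size_of;

// `Plain` and `Wide` are hypothetical names chosen for this sketch.
#[allow(dead_code)]
enum Plain {
    A, // 0
    B = 10000,
    C, // 10001
}

// Same variants, but the discriminant is forced to be a `u32`.
#[allow(dead_code)]
#[repr(u32)]
enum Wide {
    A, // 0
    B = 10000,
    C, // 10001
}

fn main() {
    // Without `repr`, the compiler is free to pick a small discriminant type.
    println!("Plain: {} bytes", size_of::<Plain>());
    // With `repr(u32)`, the enum occupies exactly 4 bytes.
    println!("Wide:  {} bytes", size_of::<Wide>());
    println!("Wide::C = {}", Wide::C as u32);
}
```

`repr(u32)` trades a little space for a predictable layout, which is what
matters for C compatibility.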

-## More to Explore
-
-Rust has several optimizations it can employ to make enums take up less space.
-
-- Null pointer optimization: For
-  [some types](https://doc.rust-lang.org/std/option/#representation), Rust
-  guarantees that `size_of::<T>()` equals `size_of::<Option<T>>()`.
-
-  Example code if you want to show how the bitwise representation _may_ look
-  like in practice. It's important to note that the compiler provides no
-  guarantees regarding this representation, therefore this is totally unsafe.
-
-  <!-- mdbook-xgettext: skip -->
-  ```rust,editable
-  use std::mem::transmute;
-
-  macro_rules! dbg_bits {
-      ($e:expr, $bit_type:ty) => {
-          println!("- {}: {:#x}", stringify!($e), transmute::<_, $bit_type>($e));
-      };
-  }
-
-  fn main() {
-      unsafe {
-          println!("bool:");
-          dbg_bits!(false, u8);
-          dbg_bits!(true, u8);
-
-          println!("Option<bool>:");
-          dbg_bits!(None::<bool>, u8);
-          dbg_bits!(Some(false), u8);
-          dbg_bits!(Some(true), u8);
-
-          println!("Option<Option<bool>>:");
-          dbg_bits!(Some(Some(false)), u8);
-          dbg_bits!(Some(Some(true)), u8);
-          dbg_bits!(Some(None::<bool>), u8);
-          dbg_bits!(None::<Option<bool>>, u8);
-
-          println!("Option<&i32>:");
-          dbg_bits!(None::<&i32>, usize);
-          dbg_bits!(Some(&0i32), usize);
-      }
-  }
-  ```
-
</details>
