naftuli.wtf An urban mystic, pining for conifers in a jungle of concrete and steel.

Rust: The Hard Parts - Part One

Rust has a reputation for being a very difficult language to learn. That matched my own experience, but just as I was told, there is a point where things start to get a lot easier. This post describes the hard parts that I had to get through in order to start being productive with Rust, in the belief that this may help others get over the hill to that sweet spot of infinite bliss and productivity.

"Nothing's ever easy, is it?"

In this post, I’m going to cover references and borrowing.

Future posts will cover lifetimes, sized and unsized types, and thread safety with Send and Sync.

I’ll be providing code examples using the Rust Playground.

NOTE: I’ll often be prefixing variable names with an underscore; this is done to suppress compiler warnings about unused variables. If I refer to a variable b, this can mean either b or _b. Hopefully this isn’t confusing, but I’m kind of pedantic about compiler warnings :stuck_out_tongue_closed_eyes:

References and Borrowing

One of the most notoriously difficult parts of Rust, and indeed one of its selling points, is its compile-time enforcement of rules about how references work: what is allowed and what isn’t. I’m deliberately keeping this section separate from lifetimes, which will be covered in a future post. Additionally, I’m going to take some liberties and use words in ways that may differ from how Rust and the community define them; this is done with the intention of making concepts easier to understand.

Owned Values

Generally, Rust provides around four different kinds of variable bindings and references. The first two are owned bindings:

  • let a = MyType{}; - the a variable binding owns the value assigned to it, and the binding is immutable, which means that a cannot be mutated.
  • let mut a = MyType{}; - the a variable binding owns the value assigned to it, and the binding itself is mutable.

These bindings are owned, meaning that they are not references in the way that will be described next. a in these examples exclusively owns what it is bound to, which is a new instance of a struct called MyType. Any binding that is owned can be moved as opposed to being borrowed.

Mutable bindings allow both reassignment and field modification. These will be demonstrated below.
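As a quick preview, here is a minimal sketch of both operations on an owned value directly, with no references involved. The value field and the mutate_owned function are hypothetical names chosen only for illustration.

```rust
// Hypothetical struct with a single field, used only for illustration.
struct MyType {
    value: u32,
}

// Builds a value through a mutable binding and returns the final field value.
fn mutate_owned() -> u32 {
    // `a` owns the value, and the binding is mutable
    let mut a = MyType { value: 1 };
    // field modification: allowed because the binding is `mut`
    a.value = 2;
    let after_field_change = a.value;
    // reassignment: `a` now owns a brand new value; the old one is dropped
    a = MyType { value: after_field_change + 40 };
    a.value
}

fn main() {
    println!("{}", mutate_owned()); // prints 42
}
```

Removing mut from the binding makes both the field modification and the reassignment fail to compile.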

With the exception of certain pointer types not described here, there can only be one true owner of a value, whereas there can be many references to that value. It’s slightly more complicated than this, and we’ll deal with the complexity of this shortly.

Unowned, Borrowed Values, AKA References

As opposed to what was defined above as owned values, there are also references to values, which have a different set of rules than owned values above:

  • let b = &a; - the b variable is a non-owned and immutable reference to a.
  • let b = &mut a; - the b variable is a non-owned and mutable reference to a. The a variable binding must be a mut binding in order to borrow a mutably.

References have an additional rule that must be obeyed:

While an owned value pointed to by a exists, there can either be:

  • Many immutable references to a.
  • One mutable reference to a.

If you have active immutable references to a, you cannot have a mutable reference to a.

If you have an active mutable reference to a, you cannot have immutable references to a.

Rust often calls these references “borrowed” values, because ownership does not change when using references: the values are borrowed and returned when they’re done being used.
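What counts as an “active” reference deserves a quick illustration. In current versions of Rust, a borrow lasts only until the reference’s last use, so a mutable borrow and immutable borrows can appear in the same function as long as they do not overlap. A minimal sketch, assuming a hypothetical value field on MyType and a hypothetical sequential_borrows function:

```rust
// Hypothetical field so there is something to mutate and read.
struct MyType {
    value: u32,
}

fn sequential_borrows() -> u32 {
    let mut a = MyType { value: 1 };

    // one mutable reference; its last use is on the next line
    let b = &mut a;
    b.value += 1;

    // `b` is never used again, so immutable references are allowed from here on
    let c = &a;
    let d = &a;
    c.value + d.value
}

fn main() {
    println!("{}", sequential_borrows()); // prints 4
}
```

Using b again after c or d has been created and is still needed would bring back the conflict described above.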

This can be confusing, so let’s write some actual code. I’ll label each example according to whether it compiles. Having described owned values and unowned borrowed references to values, we’ll start with references and then move on to owned values.

NOTE: An important note about references: they cannot outlive what they refer to. Rust simply won’t allow it. In the next post about lifetimes, I will elaborate on this.
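As a small taste of that rule, here is a sketch in which the reference is used only while its owner is alive, with the rejected variant left in comments. The value field and the read_within_scope function are hypothetical names.

```rust
// Hypothetical field so the reference has something to read.
struct MyType {
    value: u32,
}

fn read_within_scope() -> u32 {
    // The variant below is rejected by rustc, because `r` would be used
    // after the value it refers to has gone out of scope:
    //
    // let r;
    // {
    //     let a = MyType { value: 7 };
    //     r = &a;
    // } // `a` is dropped here
    // println!("{}", r.value);

    let a = MyType { value: 7 };
    // `r` borrows `a` and is only used while `a` is still alive
    let r = &a;
    r.value
}

fn main() {
    println!("{}", read_within_scope()); // prints 7
}
```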

Immutable References

Let’s start with immutable references to values:

Example 00: COMPILES :heavy_check_mark:

struct MyType;

fn main() {
    // `a` _owns_ the value
    let a = MyType{};
    // `_b` and `_c` are immutable _references_ to what `a` contains
    let _b = &a;
    let _c = &a;
}

Simply put: b and c are references to the value that a owns.

Example 01: COMPILES :heavy_check_mark:

struct MyType;

fn main() {
    // `a` _owns_ the value
    let a = MyType{};
    // `b` is an immutable _reference_ to what `a` contains
    let b = &a;
    // `c` is an immutable _reference_ to what `b` contains, which is a reference to `a`
    let _c = &b;
}

Note that c is simply a reference to a via b: its actual type is &&MyType, but Rust’s automatic dereferencing lets it act like a regular &MyType in most situations.
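Two places where a &&MyType can be seen acting like a &MyType are method calls and function arguments. In the sketch below, the id method and the takes_ref and through_double_reference functions are hypothetical, added only to have something to call.

```rust
struct MyType;

impl MyType {
    // hypothetical method, used only to have something to call
    fn id(&self) -> u32 {
        7
    }
}

// hypothetical function that expects a plain `&MyType`
fn takes_ref(r: &MyType) -> u32 {
    r.id()
}

fn through_double_reference() -> (u32, u32) {
    let a = MyType {};
    let b = &a; // type: &MyType
    let c = &b; // type: &&MyType

    // method call: the compiler dereferences `c` as often as needed
    let via_method = c.id();
    // function argument: `&&MyType` is coerced to `&MyType`
    let via_coercion = takes_ref(c);

    (via_method, via_coercion)
}

fn main() {
    println!("{:?}", through_double_reference()); // prints (7, 7)
}
```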

Mutable References

Next, we’ll create a mutable reference to a value:

Example 02: COMPILES :heavy_check_mark:

struct MyType;

fn main() {
    // `a` is a mutable, owned value
    let mut a = MyType{};
    // `b` is a mutable _reference_ to what `a` contains
    let _b = &mut a;
}

Therefore, b is a mutable reference to what a contains.

Just as we described above, there can either be one mutable reference to a value or many immutable references to a value.

Mutable and Immutable References: Fire and Gasoline

Next, let’s break things.

Example 03: DOES NOT COMPILE :x:

struct MyType;

fn e(_: &mut MyType) {}
fn f(_: &MyType) {}

fn main() {
    let mut a = MyType{};
    let b = &mut a;
    let c = &a;
    // newer versions of rustc end a borrow at its last use, so both references
    // must be used below to force the compiler to fail
    // call `e` with a mutable reference to `a`, i.e. `b`
    e(b);
    // call `f` with an immutable reference to `a`, i.e. `c`
    f(c);
}

Can you spot why this doesn’t compile? It fails because a mutable reference to a, which is b, and an immutable reference to a, which is c, are both in use at the same time: c is created while b is still waiting to be used in the call to e. Try removing one or the other to see it compile. There are two obvious ways to get it to compile:

  • Remove the definition and use of b.
  • Remove the definition and use of c.
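A third option, given that the compiler tracks where each reference is last used, is to keep both references but make sure their uses do not overlap. Here is a sketch of Example 03 reordered that way. Unlike the original, e and f do some work on a hypothetical value field so that the result can be observed, and reordered is a hypothetical name.

```rust
// Hypothetical field, added so the effect of each call is visible.
struct MyType {
    value: u32,
}

fn e(m: &mut MyType) {
    m.value += 1;
}

fn f(m: &MyType) -> u32 {
    m.value
}

fn reordered() -> u32 {
    let mut a = MyType { value: 0 };
    let b = &mut a;
    // last use of `b`: the mutable borrow ends after this call
    e(b);
    // the immutable borrow begins only after the mutable one has ended
    let c = &a;
    f(c)
}

fn main() {
    println!("{}", reordered()); // prints 1
}
```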

NOTE: Rust has very good reasons for ownership/borrowing rules, namely preventing a number of memory safety and concurrency bugs. I’m not going to describe why Rust does this, as there are plenty of articles out there describing the why and the what.

Okay, now we’ve covered the rules and seen some code, so let’s talk about moving and copying values.

Moving Owned Values

References are easy to acquire and easy to pass around, provided that you follow the rules set out above, but we now need to revisit what owned values are. We used a simplistic definition above which is nevertheless still valid:

There can be only one true owner of a value¹, and zero or more references to that value.²

  • ¹: Usually.
  • ²: Adherent to the rules spelled out above.

So far, we’ve only seen passing references around, but we haven’t yet passed actual values around. Let’s do that now.

Example 04: COMPILES :heavy_check_mark:

struct MyType;

fn main() {
    let a = MyType{};
    // move `a` into `b`; henceforth, only `b` owns the value and `a` is "destroyed"
    let _b = a;
}

This code moves the value that a owned into b. There are no references involved here. If we try to use a after moving it into b, compilation will fail:

Example 05: DOES NOT COMPILE :x:

struct MyType;

fn f(_: &MyType) {}

fn main() {
    let a = MyType{};
    // move `a`'s value into `b`
    let _b = a;
    // pass a reference to `a` into the `f` function
    f(&a);
}

This does not compile because a, as it were, was “destroyed” by moving its value into b.
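If the intent was only to reach the value through a second name, one way out is to borrow instead of move. A minimal variation on Example 05 under that assumption, with a hypothetical value field and a hypothetical borrow_instead_of_move function:

```rust
// Hypothetical field so there is something to observe.
struct MyType {
    value: u32,
}

fn f(m: &MyType) -> u32 {
    m.value
}

fn borrow_instead_of_move() -> u32 {
    let a = MyType { value: 5 };
    // borrow `a` rather than moving it; `a` still owns the value
    let b = &a;
    // both the owner and the reference remain usable
    f(&a) + f(b)
}

fn main() {
    println!("{}", borrow_instead_of_move()); // prints 10
}
```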

Copying Values

There is, however, an exception to this. If a type implements std::marker::Copy, the value is copied wherever a move would normally occur:

Example 06: COMPILES :heavy_check_mark:

fn f(_: &u64) {}

fn main() {
    // u64 implements Copy
    let a = 0u64;
    // since u64 implements Copy, it isn't moved here, instead its value is copied into the new binding
    let _b = a;
    // what failed to compile above compiles without issue here because of copying
    f(&a);
}

As the code above illustrates, the previous example that did not compile for MyType does in fact compile for u64 because u64 implements std::marker::Copy. Most primitive types in Rust, like integers and floats, are Copy because the cost of copying such values is extremely low or nonexistent.

The docs describe Copy types as:

Types whose values can be duplicated simply by copying bits.
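Your own types can opt in as well, provided that every field is itself Copy. As a sketch under that assumption, deriving Clone and Copy on a version of MyType makes the pattern from Example 05 compile. The value field and the copy_then_use function are hypothetical.

```rust
// `Copy` requires `Clone`, so both are derived.
#[derive(Clone, Copy)]
struct MyType {
    value: u32,
}

fn f(m: &MyType) -> u32 {
    m.value
}

fn copy_then_use() -> u32 {
    let a = MyType { value: 3 };
    // `MyType` is Copy here, so this copies the value instead of moving it
    let b = a;
    // `a` is still valid after the copy
    f(&a) + f(&b)
}

fn main() {
    println!("{}", copy_then_use()); // prints 6
}
```

Removing the derive line turns the let b = a; line back into a move, and the use of a afterwards no longer compiles.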

Modifying Fields and Values

Mutability in Rust generally means that you can:

  1. Modify fields of a struct.
  2. Change a variable binding to point to something else.

Let’s demonstrate both in the next example:

Example 07: COMPILES :heavy_check_mark:

struct MyType {
    value: u32
}

fn modify_field(i: &mut MyType) {
    i.value = 2;
}

fn reassign(i: &mut MyType) {
    *i = MyType { value: 1337 };
}

fn main() {
    // create a mutably owned struct
    let mut a = MyType { value: 1 };
    // pass a mutable reference to modify_field; can only create &mut from a mutable value
    modify_field(&mut a);

    // create another mutably owned struct
    let mut b = MyType { value: 2 };
    // modify b in-place
    reassign(&mut b);
}

First, we use modify_field, which takes a mutable reference to a MyType struct, to modify the value field of that struct. Next, we use reassign, which again takes a mutable reference to a MyType struct, to change the entire value that b is bound to.

Supercut :sunglasses:

We have now covered everything that we can without getting into lifetimes:

  1. Owned Values with Immutable and Mutable Bindings
  2. Immutable References
  3. Mutable References
  4. Moving Values
  5. Copying Values
  6. Modifying Fields and Values via Mutable References

Let’s now just run through two examples demonstrating all of the above.

Example 08: COMPILES :heavy_check_mark:

struct MyType;

fn mv_immutable(_a: MyType) {
    // within here, `_a` is an immutable value by default
}

fn mv_mutable(mut _a: MyType) {
    // within here, `_a` is a mutable value
}

fn ref_immutable(_: &MyType) {}
fn ref_mutable(_: &mut MyType) {}

fn main() {
    // create owned value and store in `a`
    let a = MyType{};
    // destroy `a` by moving its value into the `mv_immutable` function
    mv_immutable(a);

    // create owned value and store in `b`
    let b = MyType{};
    // destroy `b` by moving its value into the `mv_mutable` function
    mv_mutable(b);

    // create owned value and store in `c`
    let c = MyType{};
    // pass an immutable reference to `c` to the `ref_immutable` function, allowing it to borrow `c`
    ref_immutable(&c);

    // create owned value and store in `d`
    let mut d = MyType{};
    // pass a mutable reference to `d` to the `ref_mutable` function, allowing it to mutably borrow `d`
    ref_mutable(&mut d);
}

You can break this by attempting to access a or b after calling mv_immutable or mv_mutable, as these functions move the values into the functions, destroying the previous variable binding.

It’s important to keep our definitions of owned values and references to owned values separate, as seen in this example:

struct MyType;

fn main() {
    let a = MyType;
    // move the value into a new, mutable binding that shadows the old `a`
    let mut a = a;
    let _b = &mut a;
}

At first, a is an owned value with an immutable binding. The value is then moved into a new binding, also named a, that is mutable, and finally b is created as a mutable reference to a. Because there is only ever one owner, an immutable binding can be upgraded into a mutable binding in this way. With references and borrows, however, it is not possible to upgrade an immutable reference into a mutable reference, at least not safely.

Let’s finish up this section by seeing the same thing with struct methods:

Example 09: COMPILES :heavy_check_mark:

struct MyType;

impl MyType {
    fn mv_immutable(self) {}
    fn mv_mutable(mut self) {}
    fn ref_immutable(&self) {}
    fn ref_mutable(&mut self) {}
}

fn main() {
    // create owned value and store in `a`
    let a = MyType{};
    // destroy `a` by moving its value into its `mv_immutable` function
    a.mv_immutable();

    // create owned value and store it in `b`
    let b = MyType{};
    // destroy `b` by moving its value mutably into the `mv_mutable` function
    // note that `b` isn't `mut`, it doesn't need to be; since there can only be one owner of a value,
    // we can make the value mutable as it is moved
    b.mv_mutable();

    // create owned value and store in `c`
    let c = MyType{};
    // pass an immutable reference to its `ref_immutable` function
    c.ref_immutable();

    // create owned value and store it in `d`
    let mut d = MyType{};
    // pass a mutable reference to its `ref_mutable` function
    d.ref_mutable();
}

As above, try violating some of the rules:

  • Using a or b after calling mv_immutable or mv_mutable will break compilation, as they’ve been destroyed.
  • If d is not declared mut, it will be impossible to call d.ref_mutable().

Alright! We’ve seen ownership, references, mutability, moving, and copying. Play around with the examples on the playground to see how and why compilation breaks, and especially take note of how awesome rustc is in telling you exactly why things don’t work.

Conclusion

I was originally planning on cramming a lot more into this post, but I decided to split things up because there’s a LOT of ground to cover.

In summary, here’s what you need to know:

  • A value can (usually) only have one owner, not zero, and not more than one.
  • Variable bindings are either mutable or immutable.
  • Moving a value to another owner invalidates the original binding.
  • A value cannot be moved while references to it are still in use, and no earlier reference can be used after the move.
  • If a type is std::marker::Copy, its value is copied in what would normally be a move.
  • A reference, or a “borrow,” of a value can be mutable or immutable.
  • There can either be exactly one mutable reference to a value or many immutable references. Never both at the same time.
  • A reference to a value cannot outlive the value that it refers to: Rust will not allow you to do this.

Next in the series, I will cover lifetimes, sized and unsized types, and thread safety with Send and Sync.