Rust: The Hard Parts - Part One
March 20, 2019
Rust is perceived as a very difficult language to learn. I had a similar experience, but just as I was told, there is a point where things start to get a lot easier. This post describes the hard parts that I had to get through in order to start being productive with Rust, in the belief that this may help others get over the hill to that sweet spot of infinite bliss and productivity.
In this post, I’m going to cover references and borrowing.
Future posts will cover lifetimes, sized and unsized types, and thread safety with `Send` and `Sync`.
I’ll be providing code examples using the Rust Playground.
NOTE: I’ll often prefix variable names with an underscore to suppress compiler warnings about unused variables. If I refer to a variable `b`, this can mean either `b` or `_b`. Hopefully this isn’t confusing, but I’m kind of pedantic about compiler warnings.
References and Borrowing
One of the most notoriously difficult parts about Rust, and indeed one of its selling points, is enforcement of certain rules at compile time relating to how references work and what is allowed and what isn’t. I’m explicitly disambiguating this section from lifetimes, which will be covered in a future post. Additionally, I’m going to take some liberty in using words that may have a different definition than what Rust and the community use; this is done with the intention to make concepts easier to understand.
Owned Values
Generally, there are around four different kinds of variable bindings and references that Rust provides. The first two are owned bindings:
- `let a = MyType{};` - the `a` variable binding is said to own the value that was assigned to it in an immutable way, which means that `a` cannot be mutated.
- `let mut a = MyType{};` - the `a` variable binding also owns its value, but the binding itself is mutable.
These bindings are owned, meaning that they are not references in the way that will be described next. `a` in these examples exclusively owns what it is bound to, which is a new struct called `MyType`. Any binding that is owned can be moved as opposed to being borrowed.
Mutable bindings allow both reassignment and field modification. These will be demonstrated below.
With the exception of certain pointer types not described here, there can only be one true owner of a value, whereas there can be many references to that value. It’s slightly more complicated than this, and we’ll deal with the complexity of this shortly.
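To make the difference between the two kinds of owned bindings concrete, here is a minimal sketch. The `Counter` struct and the `bump_twice` and `replace_whole` helpers are names I made up for illustration; they don't appear elsewhere in this post.

```rust
// `Counter` is a hypothetical type used only for this sketch.
#[derive(Debug, PartialEq)]
struct Counter {
    count: u32,
}

// Field modification through a mutable binding.
fn bump_twice() -> Counter {
    // immutable binding: uncommenting the next line would fail to compile
    let fixed = Counter { count: 0 };
    // fixed.count = 1;

    // mutable binding: its fields can be modified
    let mut c = Counter { count: fixed.count };
    c.count += 1;
    c.count += 1;
    c
}

// Reassignment of a mutable binding.
fn replace_whole() -> Counter {
    let mut c = Counter { count: 1 };
    let old = c.count;
    // the binding is `mut`, so it can be bound to an entirely new value
    c = Counter { count: old + 9 };
    c
}

fn main() {
    println!("{:?}", bump_twice());
    println!("{:?}", replace_whole());
}
```

Both helpers own their `Counter` from start to finish; no references are involved yet.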
Unowned, Borrowed Values, AKA References
As opposed to what was defined above as owned values, there are also references to values, which have a different set of rules than owned values above:
- `let b = &a;` - the `b` variable is a non-owned and immutable reference to `a`.
- `let b = &mut a;` - the `b` variable is a non-owned and mutable reference to `a`. The `a` variable binding must be a `mut` binding in order to borrow `a` mutably.
References have an additional rule that must be obeyed:
While an owned value pointed to by `a` exists, there can either be:
- Many immutable references to `a`.
- One mutable reference to `a`.
If you have active immutable references to `a`, you cannot have a mutable reference to `a`.
If you have an active mutable reference to `a`, you cannot have immutable references to `a`.
Rust often calls these references “borrowed” values, because ownership does not change when using references: the values are borrowed and returned when they’re done being used.
This can be confusing, so let’s write some actual code. I’ll point out which examples compile and which don’t. Having described owned values and unowned borrowed references to values, we’ll start with references and then move on to owned values.
NOTE: An important note about references: they cannot outlive what they refer to. Rust simply won’t allow it. In the next post about lifetimes, I will elaborate on this.
Immutable References
Let’s start with immutable references to values:
struct MyType;
fn main() {
    // `a` _owns_ the value
    let a = MyType{};
    // `_b` and `_c` are immutable _references_ to what `a` contains
    let _b = &a;
    let _c = &a;
}
Simply put: `b` and `c` are references to the value that `a` owns.
struct MyType;
fn main() {
    // `a` _owns_ the value
    let a = MyType{};
    // `b` is an immutable _reference_ to what `a` contains
    let b = &a;
    // `c` is an immutable _reference_ to what `b` contains, which is a reference to `a`
    let _c = &b;
}
Note that `c` is simply a reference to `a` via `b`: its actual type is `&&MyType`, but the compiler dereferences it automatically in most places (on method calls and when it is passed where a `&MyType` is expected), so it acts like a regular `&MyType`.
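Here is a small sketch of a double reference being used as if it were a single one. I've given `MyType` a `value` field and added the `get` and `read` helpers purely for illustration, so that there is something to observe; none of them are part of the examples above.

```rust
// For this sketch only, `MyType` carries a field so there is something to read.
struct MyType {
    value: u32,
}

impl MyType {
    // hypothetical accessor taking a plain `&MyType`
    fn get(&self) -> u32 {
        self.value
    }
}

// hypothetical function that expects a plain `&MyType`
fn read(m: &MyType) -> u32 {
    m.value
}

fn through_double_ref() -> (u32, u32, u32) {
    let a = MyType { value: 7 };
    let b = &a; // `&MyType`
    let c = &b; // `&&MyType`

    // method call: `c` is dereferenced automatically until `get` is found
    let via_method = c.get();
    // function argument: `&&MyType` is coerced to `&MyType`
    let via_coercion = read(c);
    // the same thing spelled out by hand: peel one layer off with `*`
    let via_explicit = read(*c);

    (via_method, via_coercion, via_explicit)
}

fn main() {
    println!("{:?}", through_double_ref());
}
```

All three calls read the same field of `a`; the only difference is who removes the extra layer of reference, the compiler or you.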
Mutable References
Next, we’ll create a mutable reference to a value:
struct MyType;
fn main() {
    // `a` is a mutable, owned value
    let mut a = MyType{};
    // `b` is a mutable _reference_ to what `a` contains
    let _b = &mut a;
}
Therefore, `b` is a mutable reference to what `a` contains.
Just as we described above, there can either be one mutable reference to a value or many immutable references to a value.
Mutable and Immutable References: Fire and Gasoline
Next, let’s break things.
struct MyType;
fn e(_: &mut MyType) {}
fn f(_: &MyType) {}
fn main() {
    let mut a = MyType{};
    let b = &mut a;
    let c = &a;
    // rustc has gotten smarter: a borrow now ends at its last use, so both references
    // must actually be used below to force the compiler to fail
    // call `e` with a mutable reference to `a`, i.e. `b`
    e(b);
    // call `f` with an immutable reference to `a`, i.e. `c`
    f(c);
}
Can you spot why this doesn’t compile? It doesn’t compile because a mutable reference to `a`, which is `b`, and an immutable reference to `a`, which is `c`, are both in use at the same time. Try removing one or the other to see it compile.
There are obviously two different ways to get it to compile:
- Remove the definition and use of `b`.
- Remove the definition and use of `c`.
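A third option, not listed above, is to keep both references but make sure their uses don't overlap. The sketch below assumes a reasonably recent compiler, where a borrow ends at its last use. To make the effect visible I've given `MyType` a `value` field and made `e` and `f` do something with it; those details, and the `non_overlapping` helper, are my own additions.

```rust
// For this sketch only, `MyType` carries a field so the mutation is observable.
struct MyType {
    value: u32,
}

// takes a mutable reference and modifies the value behind it
fn e(m: &mut MyType) {
    m.value += 1;
}

// takes an immutable reference and only reads
fn f(m: &MyType) -> u32 {
    m.value
}

fn non_overlapping() -> u32 {
    let mut a = MyType { value: 1 };

    let b = &mut a;
    // this is the last use of `b`, so the mutable borrow ends here
    e(b);

    // no mutable reference is in use anymore, so an immutable one is allowed
    let c = &a;
    f(c)
}

fn main() {
    println!("{}", non_overlapping());
}
```

Moving `let c = &a;` back above `e(b);` brings the original error back, because the two borrows would overlap again.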
NOTE: Rust has very good reasons for ownership/borrowing rules, namely preventing a number of memory safety and concurrency bugs. I’m not going to describe why Rust does this, as there are plenty of articles out there describing the why and the what.
Okay, now we’ve covered the rules and seen some code, so let’s talk about moving and copying values.
Moving Owned Values
References are easy to acquire and easy to pass around, provided that you follow the rules set out above, but we now need to revisit what owned values are. We used a simplistic definition above which is nevertheless still valid:
There can be only one true owner of a value¹, and zero or more references to that value.²
- ¹: Usually.
- ²: Adherent to the rules spelled out above.
So far, we’ve only seen passing references around, but we haven’t yet passed actual values around. Let’s do that now.
struct MyType;
fn main() {
    let a = MyType{};
    // move `a` into `b`; henceforth, only `b` owns the value and `a` is "destroyed"
    let _b = a;
}
This code moves the value that `a` owned into `b`. There are no references involved here. If we try to use `a` after moving it into `b`, compilation will fail:
struct MyType;
fn f(_: &MyType) {}
fn main() {
    let a = MyType{};
    // move `a`'s value into `b`
    let _b = a;
    // pass a reference to `a` into the `f` function
    f(&a);
}
This does not compile because `a`, as it were, was “destroyed” by moving its value into `b`.
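For contrast, here is a sketch of a version that does compile: borrow the value while the original binding still owns it, and after a move, only use the new owner. The `value` field and the `pass_through`, `borrow_only`, and `round_trip` helpers are hypothetical names introduced just for this illustration.

```rust
// For this sketch only, `MyType` carries a field so there is something to read.
struct MyType {
    value: u32,
}

// takes ownership of its argument and hands ownership back to the caller
fn pass_through(m: MyType) -> MyType {
    m
}

// only borrows its argument
fn borrow_only(m: &MyType) -> u32 {
    m.value
}

fn round_trip() -> u32 {
    let a = MyType { value: 1 };

    // borrowing does not move anything: `a` is still usable afterwards
    let before = borrow_only(&a);

    // `a` is moved into the function; whatever comes back is owned by `b`
    let b = pass_through(a);

    // using `a` here would fail to compile, but `b` is fine
    before + borrow_only(&b)
}

fn main() {
    println!("{}", round_trip());
}
```

The value itself never goes away during the round trip; only the name that is allowed to reach it changes.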
Copying Values
There is, however, an exception to this. If a type implements `std::marker::Copy`, when a move would normally occur, the value is instead copied rather than moved:
fn f(_: &u64) {}
fn main() {
    // u64 implements Copy
    let a = 0u64;
    // since u64 implements Copy, it isn't moved here, instead its value is copied into the new binding
    let _b = a;
    // what failed to compile above compiles without issue here because of copying
    f(&a);
}
As the code above illustrates, the previous example that did not compile for `MyType` does in fact compile for `u64` because `u64` implements `std::marker::Copy`. Most primitive types in Rust, like integers and floats, are `Copy` because the cost of copying such values is extremely low or nonexistent.
The docs describe `Copy` types as:
Types whose values can be duplicated simply by copying bits.
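Your own types can opt into this behavior too, as long as every field is itself `Copy`. Here's a minimal sketch; the `Point` struct and the `take` and `copy_demo` helpers are invented for this example. Note that `Copy` requires `Clone` as well, which is why both are derived.

```rust
// `Point` is a hypothetical type; both of its fields are `Copy`, so it can be too.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// takes its argument by value, which would normally be a move
fn take(p: Point) -> i32 {
    p.x + p.y
}

fn copy_demo() -> (i32, Point) {
    let a = Point { x: 1, y: 2 };
    // `Point` is `Copy`, so `a` is copied into the function rather than moved
    let sum = take(a);
    // `a` is still usable here
    (sum, a)
}

fn main() {
    println!("{:?}", copy_demo());
}
```

Remove `Copy` from the derive list and `copy_demo` stops compiling, for the same reason the `MyType` example above did.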
Modifying Fields and Values
Mutability in Rust generally means that you can:
- Modify fields of a struct.
- Change a variable binding to point to something else.
Let’s demonstrate both in the next example:
struct MyType {
    value: u32
}
fn modify_field(i: &mut MyType) {
    i.value = 2;
}
fn reassign(i: &mut MyType) {
    *i = MyType { value: 1337 };
}
fn main() {
    // create a mutably owned struct
    let mut a = MyType { value: 1 };
    // pass a mutable reference to modify_field; can only create &mut from a mutable value
    modify_field(&mut a);
    // create another mutably owned struct
    let mut b = MyType { value: 2 };
    // replace the entire value of `b` through a mutable reference
    reassign(&mut b);
}
First, we use `modify_field`, which takes a mutable reference to a `MyType` struct, to modify the `value` field of that struct. Next, we use `reassign`, which again takes a mutable reference to a `MyType` struct, to change the entire value that `b` is bound to.
Supercut
We have now covered everything that we can without getting into lifetimes:
- Owned Values with Immutable and Mutable Bindings
- Immutable References
- Mutable References
- Moving Values
- Copying Values
- Modifying Fields and Values via Mutable References
Let’s now just run through two examples demonstrating all of the above.
struct MyType;
fn mv_immutable(_a: MyType) {
    // within here, `_a` is an immutable value by default
}
fn mv_mutable(mut _a: MyType) {
    // within here, `_a` is a mutable value
}
fn ref_immutable(_: &MyType) {}
fn ref_mutable(_: &mut MyType) {}
fn main() {
    // create owned value and store in `a`
    let a = MyType{};
    // destroy `a` by moving its value into the `mv_immutable` function
    mv_immutable(a);
    // create owned value and store in `b`
    let b = MyType{};
    // destroy `b` by moving its value into the `mv_mutable` function
    mv_mutable(b);
    // create owned value and store in `c`
    let c = MyType{};
    // pass an immutable reference to `c` to the `ref_immutable` function, allowing it to borrow `c`
    ref_immutable(&c);
    // create owned value and store in `d`
    let mut d = MyType{};
    // pass a mutable reference to `d` to the `ref_mutable` function, allowing it to mutably borrow `d`
    ref_mutable(&mut d);
}
You can break this by attempting to access `a` or `b` after calling `mv_immutable` or `mv_mutable`, as these functions move the values into the functions, destroying the previous variable binding.
It’s important to keep our definitions of owned values and references to owned values separate, as seen in this example:
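struct MyType;
fn main() {
    // owned value with an immutable binding
    let a = MyType;
    // move the value into a new, mutable binding that shadows the old `a`
    let mut a = a;
    let _b = &mut a;
}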
At first, `a` is an owned value with an immutable binding, then becomes an owned value with a mutable binding, and finally `b` is created as a mutable reference to `a`. We see above that an immutable binding can be upgraded into a mutable binding. With references and borrows, however, it is not possible to upgrade an immutable reference into a mutable reference, at least not safely.
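The upgrade of an owned binding can be sketched in two equivalent ways: rebinding with `let mut`, or moving into a function whose parameter is declared `mut`. As before, the `value` field and the `upgrade` and `upgrade_in_param` helpers are hypothetical and only exist to make the result observable.

```rust
// For this sketch only, `MyType` carries a field so the mutation is observable.
struct MyType {
    value: u32,
}

// upgrade by moving the value into a new `mut` binding
fn upgrade() -> u32 {
    let a = MyType { value: 1 };
    // the value moves from the immutable `a` into a mutable `a` that shadows it
    let mut a = a;
    a.value = 2;
    a.value
}

// upgrade at a function boundary: the caller's binding doesn't need to be `mut`
fn upgrade_in_param(mut m: MyType) -> u32 {
    m.value += 40;
    m.value
}

fn main() {
    println!("{}", upgrade());

    let plain = MyType { value: 2 };
    println!("{}", upgrade_in_param(plain));
}
```

This works because there is only one owner at any time: whoever owns the value gets to decide whether it is mutable.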
Let’s finish up this section by seeing the same thing with struct methods:
struct MyType;
impl MyType {
    fn mv_immutable(self) {}
    fn mv_mutable(mut self) {}
    fn ref_immutable(&self) {}
    fn ref_mutable(&mut self) {}
}
fn main() {
    // create owned value and store in `a`
    let a = MyType{};
    // destroy `a` by moving its value into its `mv_immutable` function
    a.mv_immutable();
    // create owned value and store it in `b`
    let b = MyType{};
    // destroy `b` by moving its value mutably into the `mv_mutable` function
    // note that `b` isn't `mut`, it doesn't need to be; since there can only be one owner of a value,
    // we can make the value mutable as it is moved
    b.mv_mutable();
    // create owned value and store in `c`
    let c = MyType{};
    // pass an immutable reference to its `ref_immutable` function
    c.ref_immutable();
    // create owned value and store it in `d`
    let mut d = MyType{};
    // pass a mutable reference to its `ref_mutable` function
    d.ref_mutable();
}
As above, try violating some of the rules:
- Using `a` or `b` after calling `mv_immutable` or `mv_mutable` will break compilation, as they’ve been destroyed.
- If `d` is not declared `mut`, it will be impossible to call `d.ref_mutable()`.
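The empty methods above only demonstrate what compiles. Here's a sketch with methods that actually do something, so you can see what each receiver type permits. The `Tally` struct, its methods, and the `run` helper are made-up names for this illustration.

```rust
// `Tally` is a hypothetical type used only for this sketch.
struct Tally {
    count: u32,
}

impl Tally {
    // `&self`: borrows immutably, can only read
    fn get(&self) -> u32 {
        self.count
    }

    // `&mut self`: borrows mutably, can modify in place
    fn increment(&mut self) {
        self.count += 1;
    }

    // `self`: takes ownership, the caller's binding is gone afterwards
    fn into_count(self) -> u32 {
        self.count
    }
}

fn run() -> (u32, u32) {
    // must be `mut` because `increment` needs a mutable borrow
    let mut t = Tally { count: 0 };
    t.increment();
    t.increment();

    // an immutable borrow is fine once the mutable ones have ended
    let seen = t.get();

    // moves `t` into the method; calling `t.get()` after this would not compile
    let last = t.into_count();

    (seen, last)
}

fn main() {
    println!("{:?}", run());
}
```

The receiver in the method signature tells you at a glance whether a call reads, modifies, or consumes the value.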
Alright! We’ve seen ownership, references, mutability, moving, and copying. Play around with the examples on the playground to see how and why compilation breaks, and especially take note of how awesome `rustc` is in telling you exactly why things don’t work.
Conclusion
I was originally planning on cramming a lot more into this post, but I decided to split things up because there’s a LOT of ground to cover.
In summary, here’s what you need to know:
- A value can (usually) only have one owner, not zero, and not more than one.
- Variable bindings are either mutable or immutable.
- Moving a value to another owner “destroys” the original binding: it can no longer be used.
- A value cannot be moved while references to it are still in use, and references taken before the move cannot be used after it.
- If a type is `std::marker::Copy`, its value is copied in what would normally be a move.
- A reference, or a “borrow,” of a value can be mutable or immutable.
- There can either be exactly one mutable reference to a value or many immutable references. Never both at the same time.
- A reference to a value cannot outlive the value that it refers to: Rust will not allow you to do this.
Next in the series, I will cover lifetimes, sized and unsized types, and thread safety with `Send` and `Sync`.