Extra Ownership Practice

Will it compile?

Here are a few examples designed to refine your understanding of Rust’s ownership model. For each of these examples, think about:

Will it compile?
If not, why not? What could go wrong in an equivalent C or C++ program that does compile?

Ownership and mutability

Here’s our first example:

fn main() {
    let s = String::from("hello");
    s.push_str(" world");
}

This fails, because s is immutable by default, and push_str would mutate the string.

In C and C++ (and most other languages), variables are mutable by default, and you need to explicitly designate them as immutable using const. In C++, a const string cannot be mutated, and attempting to do so would also produce a compiler error. In C, the const keyword is a real mess… const char* gets parsed as (const char)*, meaning you’re not allowed to modify the destination buffer, but you could reassign that variable to point to a different string. You can also write char* const, which does the opposite: you can’t reassign the variable to point to a different string, but you can modify the string buffer. If you want to get a true immutable string, you have to use the type const char* const (or char const* const), which is just silly. (Demo here) Also, const was introduced later in C’s development, so const gets used very inconsistently throughout the standard library, and it’s not uncommon to need to insert dubious casts to get your code to compile.

To fix, we need to use the mut keyword:

fn main() {
    let mut s = String::from("hello");
    s.push_str(" world");
}

Let’s take a look at passing variables to functions. Does this code compile?

fn om_nom_nom(s: String) {
    println!("{}", s);
}

fn main() {
  let s = String::from("hello");
  om_nom_nom(s);
}

This works! What if we add another om_nom_nom call?

fn om_nom_nom(param: String) {
    println!("{}", param);
}

fn main() {
  let s = String::from("hello");
  om_nom_nom(s);
  om_nom_nom(s);
}

The compiler complains about ownership here. Let’s break this down:

On the first line of main, s owns the string.
On the next line, ownership gets transferred to the param parameter of om_nom_nom.
When om_nom_nom returns, param goes out of scope, and ownership of the string hasn’t been transferred anywhere else, so the string is “dropped” and the string’s memory is freed.

Back in main, on the third line, we try to use s again. However, we previously gave s away (and in fact s has already been destroyed). The compiler complains with an error explaining this:

error[E0382]: use of moved value: `s`
 --> src/main.rs:8:14
  |
6 |   let s = String::from("hello");
  |       - move occurs because `s` has type `std::string::String`, which does not implement the `Copy` trait
7 |   om_nom_nom(s);
  |              - value moved here
8 |   om_nom_nom(s);
  |              ^ value used here after move

error: aborting due to previous error

Important note for understanding: I think a lot of people look at this demo and think, oh my gosh, that’s annoying. Why does the compiler have to make things so complicated? However, I would argue that this only looks silly because it doesn’t have any mallocs or frees, and because the code is so short. Let’s look at how we might have done this in C (keeping in mind that String is a heap-allocated buffer).

We could have written the code like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
    free(s);
}

Or like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
    free(s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}

Or like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
    free(s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
    free(s);
}

Or like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}

Of these four possibilities, only one works without memory errors. Keep in mind that this is a trivial example, and real systems code is far more complex. 100+ line functions aren’t rare, and it’s not uncommon to have memory that is allocated in one place and freed 9 hours and 2000k lines of code later. It’s extremely important to maintain some notion of ownership, i.e. some notion of who is responsible for cleaning up resources.
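To tie this back to Rust: the question "who frees the buffer?" is answered by the function signature rather than by convention. Here is a minimal sketch of the two sensible C variants expressed in Rust. The names consume and peek are made up for illustration, and peek uses a reference, which is covered in the References section.

```rust
// Takes ownership: the String is dropped (freed) when this function returns,
// like the C version where the callee calls free().
fn consume(s: String) -> usize {
    println!("{}", s);
    s.len()
}

// Borrows: the caller keeps ownership and stays responsible for cleanup,
// like the C version where main calls free().
fn peek(s: &String) -> usize {
    println!("{}", s);
    s.len()
}

fn main() {
    let s = String::from("hello");
    peek(&s);
    peek(&s); // fine: s was only borrowed
    consume(s); // s is moved here and freed when consume returns
    // peek(&s); // would not compile: s has been moved
}
```

The double-free and use-after-free variants from the C code have no direct equivalent here: the commented-out line is rejected at compile time.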

Exceptions to ownership

What if we pass a u32 (unsigned int) instead of a String? Is this always an issue?

fn om_nom_nom(param: u32) {
    println!("{}", param);
}

fn main() {
    let x = 1;
    om_nom_nom(x);
    om_nom_nom(x);
}

This actually works fine! As mentioned in last week’s lecture, the type u32 implements the Copy trait, which changes what happens when a value is assigned to a variable or passed as a parameter. We will talk more about traits in a couple of weeks, but for now, just know that if a type implements Copy (a marker that says its values can be duplicated by simply copying their bits), then it is copied on assignment and when passed as a parameter, and the original stays usable.

This is probably pretty confusing. How are you supposed to anticipate whether the compiler will copy a value when you pass it, or whether it will use ownership semantics? Unfortunately, you kind of just need to know about the types you’re using. The good news is that the vast majority of types aren’t tricky like this and use normal ownership semantics. Only primitive types and a handful of others use copy semantics.
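To make the difference concrete, here is a small sketch. Point, take_point, and take_string are hypothetical names invented for this example, and the derive line is one way to opt a type into Copy when all of its fields are Copy (more on that when we get to traits).

```rust
// Hypothetical type for illustration: every field is Copy, so we may opt in.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: u32,
    y: u32,
}

fn take_point(p: Point) -> u32 {
    p.x + p.y
}

fn take_string(s: String) -> usize {
    s.len()
}

fn main() {
    let p = Point { x: 1, y: 2 };
    take_point(p);
    take_point(p); // fine: Point is Copy, so each call gets its own copy

    let s = String::from("hello");
    take_string(s.clone()); // explicit duplicate of the heap buffer
    take_string(s); // the original is moved here and cannot be used afterwards
}
```

Note that String never copies implicitly: if you want a second copy of the heap buffer, you have to ask for it with clone.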

References

Let’s talk about borrowing. How does this code look to you?

fn main() {
    let s = String::from("hello");
    let s1 = &s;
    let s2 = &s;
    println!("{} {}", s, s1);
}

This code works fine because s, s1, and s2 are all immutable. Remember, you can have as many read-only pointers to something as you want, as long as no one can change what is being pointed to. (We want to avoid the scenario where chaos ensues because people are making sneak edits to the Google doc while others are trying to read it over.)

What if we bring mutable references into the mix?

fn main() {
    let s = String::from("hello");
    let s1 = &mut s;
    let s2 = &s;
    println!("{} {}", s, s1);
}

This fails to compile because s is immutable, and on the next line, we try to borrow a mutable reference to s. If this were allowed, we could modify the string using s1, even though it was supposed to be immutable.

Let’s fix that by declaring s as mutable:

fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    let s2 = &s;
    println!("{} {} {}", s, s1, s2);
}

This fails again, but for a different reason.

We first declare s as mutable. 👍
We borrow a mutable reference to s. 👍
We try to borrow an immutable reference to s. However, there already exists a mutable reference to s. Rust doesn’t allow any other references to exist while a mutable reference is still in use. Otherwise, the mutable reference could be used to change (potentially reallocate) memory when code using the other references least expects it.

Let’s remove the second borrow. Does this work?

fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    println!("{} {}", s, s1);
}

We first declare s as mutable. 👍
We borrow a mutable reference to s. 👍
We try to use s. However, the value has been “borrowed out” to s1 and hasn’t been “returned” yet. As such, we can’t use s.

Here’s the compiler error:

error[E0502]: cannot borrow `s` as immutable because it is also borrowed as mutable
 --> src/main.rs:4:23
  |
3 |     let s1 = &mut s;
  |              ------ mutable borrow occurs here
4 |     println!("{} {}", s, s1);
  |                       ^  -- mutable borrow later used here
  |                       |
  |                       immutable borrow occurs here

The compiler is saying, “Hey, you borrowed s here, into s1. Now you’re trying to use s, but you haven’t gotten the value back yet. I can’t give you the value back yet, because s1 is still going to be used (as the second thing being printed in that println).”

How about this code?

fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    println!("{}", s1);
    println!("{}", s);
}

Unlike the previous example, this actually works. After the first println, Rust sees that s1 will not be used again, so it “returns” the borrowed value back to s. Then, when we try to use s, everything checks out. 👌
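The same rule holds when the mutable reference is actually used to mutate. This is a minimal sketch; build and shout are hypothetical helper names that do not appear elsewhere in these notes.

```rust
// Hypothetical helper: mutates the string through a mutable reference.
fn shout(s: &mut String) {
    s.push('!');
}

fn build() -> String {
    let mut s = String::from("hello");
    let s1 = &mut s;
    s1.push_str(" world"); // last use of s1: the mutable borrow ends here
    println!("{}", s); // s is usable again
    shout(&mut s); // a new, short-lived mutable borrow
    s
}

fn main() {
    println!("{}", build());
}
```

If you moved the s1.push_str line below the first println, the borrow would still be live when s is read, and you would get the E0502 error shown earlier.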

Here’s a question we got from a survey last year:

“One thing that’s confusing is why sometimes I need to &var and other times I can just use var: for example, set.contains(&var), but set.insert(var) – why?”

Can you answer this question based on your understanding of references now?

When inserting an item into a set, we want to transfer ownership of that item into the set; that way, the item will exist as long as the set exists. (It would be bad if you added a string to the set, and then someone freed the string while it was still a member of the set.) However, when trying to see if the set contains an item, we want to retain ownership, so we only pass a reference.
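Here is that answer as a small sketch using the standard library’s HashSet. The function name build_set and the choice of String elements are assumptions made for the example.

```rust
use std::collections::HashSet;

fn build_set() -> HashSet<String> {
    let mut set: HashSet<String> = HashSet::new();
    let var = String::from("hello");

    // contains only needs to look at the value, so it borrows it.
    assert!(!set.contains(&var));

    // insert stores the value, so the set takes ownership of it.
    set.insert(var);

    // var has been moved into the set; we can still query with a borrowed string.
    assert!(set.contains("hello"));
    set
}

fn main() {
    let set = build_set();
    println!("{}", set.len());
}
```

After the insert, using var again would produce the same "use of moved value" error we saw with om_nom_nom.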

Borrowing and references: back to iterator invalidation

In class, I zoomed through this code example, which I argued illustrates the utility of limiting a value to one mutable reference. I’m going to work through this example a bit more here.

fn main() {
    let mut v = vec![1, 2, 3];
    /* This for loop borrows the vector above to do its work */    
    for i in &v { 
        println!("{}", i);
        v.push(34);
    }
}

First, I want to be clear that if you wrote equivalent code in C++, it would compile fine, but it would be unsafe. See the explanation of “iterator invalidation” related to the vector example described in lecture 3. The issue here is that we’re modifying a vector while trying to iterate through it. Since pushing new values to a data structure will often result in data being moved around in memory under the hood, it’s very likely that you’ll eventually access freed, arbitrary memory on the heap. This will likely crash your program; though, since we’re printing here, it could also leak private data.

Aside: When I (Thea) was a CA for CS144, I worked with a student who had an iterator invalidation bug: they called a function that conditionally (in an if statement) modified a data structure inside of a loop that iterated through that same data structure. They didn’t catch the bug during testing, because it happened to work fine when they ran it. When we ran their program through our autograder the first time, almost every single test failed with a segmentation fault. In other words, their program sometimes worked, with some inputs, on their machine – and happened to work almost all the time when they were testing – and sometimes accessed invalid memory, resulting in a segmentation fault. (In case you’re wondering, we emailed them to ask about it, and they found and fixed the bug!)

Ok. Back to Rust. When compiling the above code, we get the following error:

error[E0502]: cannot borrow `v` as mutable because it is also borrowed as immutable
 --> src/main.rs:6:9
  |
4 |     for i in &v { 
  |              --
  |              |
  |              immutable borrow occurs here
  |              immutable borrow later used here
5 |         println!("{}", i);
6 |         v.push(34);
  |         ^^^^^^^^^^ mutable borrow occurs here

To break this down a bit, I want to make the function calls that are happening here more explicit.

for i in &v:

There’s actually an implicit function call here. The function creates an “iterator”, an object that allows us to loop through the vector.
We give this function an immutable reference to v (&v). We then get immutable references to items in v, one-by-one, stored in the variable i.
What’s important here: for the scope of this for loop – until the closing } – there’s an immutable reference to v in existence.

v.push(34);

This calls a function that pushes an element to the vector v.
There’s an implicit parameter here: push takes in a mutable reference to v as its first parameter. In other words, v.push(34) could, conceptually, be written as push(&mut v, 34). (Don’t worry about this syntax yet! I just want to make explicit the reference requirement here.)
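For the curious, both implicit calls can be written out in real Rust. This is only a sketch to make the borrows visible: demo is a made-up function name, and the loop uses v.iter(), which for a vector does the same job as the implicit call made by for i in &v.

```rust
fn demo() -> Vec<i32> {
    let mut v = vec![1, 2, 3];

    // Method-call form: the mutable borrow of v is implicit.
    v.push(34);
    // Equivalent explicit form: the first argument is &mut v.
    Vec::push(&mut v, 35);

    // The for loop's implicit call, written out: iter() borrows v immutably
    // for as long as the loop runs.
    for i in v.iter() {
        println!("{}", i);
    }
    v
}

fn main() {
    println!("{:?}", demo());
}
```

Here the pushes happen before the loop starts, so the mutable and immutable borrows never overlap and the code compiles.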

So, why do we get the above error?

for i in &v initializes an immutable reference to v.
v.push(34) (tries to) initialize a mutable reference to v.
However, the original immutable reference to v is still in scope.
We can’t have a mutable and immutable reference in existence at the same time.
Bad times.

Let’s play around with this code a bit. What if we take a mutable reference to v in the for loop?

fn main() {
    let mut v = vec![1, 2, 3];
    /* This for loop borrows the vector above to do its work */    
    for i in &mut v { 
        println!("{}", i);
        v.push(34);
    }
}

Again, we get an error:

error[E0499]: cannot borrow `v` as mutable more than once at a time
 --> src/main.rs:6:9
  |
4 |     for i in &mut v { 
  |              ------
  |              |
  |              first mutable borrow occurs here
  |              first borrow later used here
5 |         println!("{}", i);
6 |         v.push(34);
  |         ^^^^^^^^^^ second mutable borrow occurs here

for i in &mut v creates a mutable reference to v. While that’s in scope, we can’t create a second mutable reference to v – which is exactly what v.push tries to do.

What if we got rid of the & altogether?

fn main() {
    let mut v = vec![1, 2, 3];
    for i in v { 
        println!("{}", i);
        v.push(34);
    }
}

This gives us an entirely different error:

error[E0382]: borrow of moved value: `v`
   --> src/main.rs:5:9
    |
2   |     let mut v = vec![1, 2, 3];
    |         ----- move occurs because `v` has type `Vec<i32>`, which does not implement the `Copy` trait
3   |     for i in v { 
    |              -
    |              |
    |              `v` moved due to this implicit call to `.into_iter()`
    |              help: consider borrowing to avoid moving into the for loop: `&v`
4   |         println!("{}", i);
5   |         v.push(34);
    |         ^^^^^^^^^^ value borrowed here after move
    |
note: this function takes ownership of the receiver `self`, which moves `v`

This is a transfer of ownership:

Originally, v owns the vector object.
Then, we call for i in v (note the lack of &). There’s an implicit call here to .into_iter(), which creates an iterator for the vector. By getting rid of &, we’ve transferred ownership of the vector to the iterator.
When we call v.push(34), the compiler complains: the vector object can only have one owner, and that owner is no longer v. We can no longer use v to access the vector object.

Again, what’s important in this example is not Rust syntax, which will become more clear as we move forward. What I want you to take away from this is that these (sometimes annoying) Rust rules can help prevent real-world issues that we observe in C/C++.

Conclusion: in class, someone asked, “how would we modify the code to make this work?” This is a great question, and the answer is we can’t, and that’s a good thing. Modifying a data structure when you’re iterating through it is a fundamentally incorrect thing to do, and the Rust compiler stops us from doing it for a reason.
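What we can do is restructure the code so that iterating and modifying never happen at the same time. The sketch below assumes the intent was to append one 34 per original element; push_per_element and to_add are hypothetical names. It does not modify the vector during iteration, which is the whole point.

```rust
fn push_per_element(mut v: Vec<i32>) -> Vec<i32> {
    // Phase 1: iterate with an immutable borrow and record what to add.
    let mut to_add = Vec::new();
    for i in &v {
        println!("{}", i);
        to_add.push(34);
    }
    // Phase 2: the loop (and its borrow) is over, so mutating v is allowed.
    v.extend(to_add);
    v
}

fn main() {
    let v = push_per_element(vec![1, 2, 3]);
    println!("{:?}", v);
}
```

The loop also terminates, which the original version would not have done even if it had compiled, since every iteration would add another element to visit.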

CS 110L
