Extra Ownership Practice

Will it compile?

Here are a few examples designed to refine your understanding of Rust’s ownership model. For each of these examples, think about whether the code will compile and, if it won’t, why the compiler rejects it.

Ownership and mutability

Here’s our first example:

fn main() {
    let s = String::from("hello");
    s.push_str(" world");
}

This fails, because s is immutable by default, and push_str would mutate the string.

In C and C++ (and most other languages), variables are mutable by default, and you need to explicitly designate them as immutable using const. In C++, mutating a const string would also produce a compiler error. In C, the const keyword is a real mess: const char* gets parsed as (const char)*, meaning you’re not allowed to modify the destination buffer, but you can reassign the variable to point to a different string. You can also write char* const, which does the opposite: you can’t reassign the variable to point to a different string, but you can modify the string buffer. If you want a truly immutable string, you have to use the type const char* const (or char const* const), which is just silly. Also, const was introduced later in C’s development, so it gets used very inconsistently throughout the standard library, and it’s not uncommon to need to insert dubious casts to get your code to compile.

To fix, we need to use the mut keyword:

fn main() {
    let mut s = String::from("hello");
    s.push_str(" world");
}

Let’s take a look at passing variables to functions. Does this code compile?

fn om_nom_nom(s: String) {
    println!("{}", s);
}

fn main() {
    let s = String::from("hello");
    om_nom_nom(s);
}

This works! What if we add another om_nom_nom call?

fn om_nom_nom(param: String) {
    println!("{}", param);
}

fn main() {
    let s = String::from("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}

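The compiler complains about ownership here. Let’s break this down: the first call moves s into om_nom_nom, so the parameter param becomes the new owner of the string. When that call returns, param goes out of scope and the string is freed. By the time we reach the second call, s no longer owns anything, so the compiler rejects it as a use of a moved value.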

Important note for understanding: I think a lot of people look at this demo and think, oh my gosh, that’s annoying. Why does the compiler have to make things so complicated? However, I would argue that this only looks silly because it doesn’t have any mallocs or frees, and because the code is so short. Let’s look at how we might have done this in C (keeping in mind that String is a heap-allocated buffer).

We could have written the code like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
    free(s);
}

Or like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
    free(s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}

Or like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
    free(s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
    free(s);
}

Or like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}

Of these four possibilities, only the first works without memory errors. The second reads and frees the buffer after it has already been freed, the third does the same and then frees it yet again in main, and the fourth never frees it at all. Keep in mind that this is a trivial example, and real systems code is far more complex. 100+ line functions aren’t rare, and it’s not uncommon to have memory that is allocated in one place and freed 9 hours and 2000k lines of code later. It’s extremely important to maintain some notion of ownership, i.e. some notion of who is responsible for cleaning up resources.
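For comparison, here is a rough sketch of how the two ownership conventions from the C variants might be spelled out in Rust, where the function signature says who is responsible for cleanup. The helper name peek and the usize return values are made up for this illustration; only om_nom_nom comes from the examples above.

```rust
// Borrowing version (hypothetical helper): the caller keeps ownership,
// like the C variant where main() calls free() itself.
fn peek(s: &String) -> usize {
    println!("{}", s);
    s.len()
}

// Consuming version: the callee takes ownership, and the String is
// freed automatically when `s` goes out of scope at the end of the function.
fn om_nom_nom(s: String) -> usize {
    println!("{}", s);
    s.len()
}

fn main() {
    let s = String::from("hello");
    peek(&s); // borrow: `s` is still usable afterwards
    peek(&s);
    om_nom_nom(s.clone()); // hand over an independent copy of the buffer
    om_nom_nom(s); // hand over the original; `s` cannot be used after this
}
```

Either convention is fine. The point is that the compiler checks that the caller and the callee agree on it, which is exactly what went wrong in three of the four C versions.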

Exceptions to ownership

Is passing a value to a function twice always an issue? What if we pass a u32 (an unsigned 32-bit integer) instead of a String?

fn om_nom_nom(param: u32) {
    println!("{}", param);
}

fn main() {
    let x = 1;
    om_nom_nom(x);
    om_nom_nom(x);
}

This actually works fine! As mentioned in last week’s lecture, the type u32 implements the Copy trait, which changes what happens when a value is assigned to a variable or passed as a parameter. We will talk more about traits in a couple of weeks, but for now, just know that if a type implements Copy, its values are copied (rather than moved) on assignment and when passed as a parameter, so the original stays usable.

This is probably pretty confusing. How are you supposed to anticipate whether the compiler will copy a value when you pass it, or whether it will use ownership semantics? Unfortunately, you kind of just need to know about the types you’re using. The good news is that the vast majority of types aren’t tricky like this and use normal ownership semantics. Only primitive types and a handful of others use copy semantics.
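To make the distinction a bit more concrete, here is a small sketch contrasting a Copy type with String. The Point struct and the two helper functions are hypothetical, invented just for this example.

```rust
// `Point` is a made-up type for this sketch. Deriving `Copy` opts it in
// to copy semantics; this is only allowed because both fields are `Copy`.
#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn take_point(p: Point) -> i32 {
    p.x + p.y
}

fn take_string(s: String) -> usize {
    s.len()
}

fn main() {
    let p = Point { x: 1, y: 2 };
    take_point(p);
    take_point(p); // fine: each call receives its own copy of `p`

    let s = String::from("hello");
    take_string(s.clone()); // `String` is not `Copy`, so we copy explicitly
    take_string(s); // this call moves `s`; using `s` afterwards would not compile
}
```

One way to check a type you’re unsure about is to look at its documentation page and see whether Copy is listed among the traits it implements.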

References

Let’s talk about borrowing. How does this code look to you?

fn main() {
    let s = String::from("hello");
    let s1 = &s;
    let s2 = &s;
    println!("{} {}", s, s1);
}

This code works fine because s, s1, and s2 are all immutable. Remember, you can have as many read-only pointers to something as you want, as long as no one can change what is being pointed to. (We want to avoid the scenario where chaos ensues because people are making sneak edits to the Google doc while others are trying to read it over.)

What if we bring mutable references into the mix?

fn main() {
    let s = String::from("hello");
    let s1 = &mut s;
    let s2 = &s;
    println!("{} {}", s, s1);
}

This fails to compile because s is immutable, and on the next line, we try to borrow a mutable reference to s. If this were allowed, we could modify the string using s1, even though it was supposed to be immutable.

Let’s fix that by declaring s as mutable:

fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    let s2 = &s;
    println!("{} {} {}", s, s1, s2);
}

This fails again, but for a different reason.

Let’s remove the second borrow. Does this work?

fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    println!("{} {}", s, s1);
}

No, it doesn’t. Here’s the compiler error:

error[E0502]: cannot borrow `s` as immutable because it is also borrowed as mutable
 --> src/main.rs:4:23
  |
3 |     let s1 = &mut s;
  |              ------ mutable borrow occurs here
4 |     println!("{} {}", s, s1);
  |                       ^  -- mutable borrow later used here
  |                       |
  |                       immutable borrow occurs here

The compiler is saying, “Hey, you borrowed s here, into s1. Now you’re trying to use s, but you haven’t gotten the value back yet. I can’t give you the value back yet, because s1 is still going to be used (as the second thing being printed in that println).”

How about this code?

fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    println!("{}", s1);
    println!("{}", s);
}

Unlike the previous example, this actually works. After the first println, Rust sees that s1 will not be used again, so it “returns” the borrowed value back to s. Then, when we try to use s, everything checks out. 👌
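The same rule applies when the mutable reference is actually used to change the string. Below is a minimal sketch; the function names shout and build are made up for illustration.

```rust
// Hypothetical helper that mutates a string through a mutable reference.
fn shout(s: &mut String) {
    s.push('!');
}

fn build() -> String {
    let mut s = String::from("hello");
    let s1 = &mut s;
    s1.push_str(" world");
    shout(s1);
    // The line above is the last use of `s1`, so the mutable borrow ends
    // there and `s` can be used directly again.
    println!("{}", s);
    s
}

fn main() {
    build();
}
```

If you added another use of s1 after the println, the borrow would have to last longer, and the println on s would be rejected just like in the earlier example.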

Here’s a question we got from a survey last year:

“One thing that’s confusing is why sometimes I need to &var and other times I can just use var: for example, set.contains(&var), but set.insert(var) – why?”

Can you answer this question based on your understanding of references now?

When inserting an item into a set, we want to transfer ownership of that item into the set; that way, the item will exist as long as the set exists. (It would be bad if you added a string to the set, and then someone freed the string while it was still a member of the set.) However, when trying to see if the set contains an item, we want to retain ownership, so we only pass a reference.
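Here is a short sketch of that difference using the standard library’s HashSet. The variable names and the demo function are made up for this example.

```rust
use std::collections::HashSet;

fn demo() -> (bool, bool) {
    let mut set: HashSet<String> = HashSet::new();
    let word = String::from("hello");
    let probe = String::from("hello");

    // `insert` takes its argument by value: `word` is moved into the set,
    // so the set owns it and keeps it alive.
    set.insert(word);

    // `contains` only needs to look at the value, so it takes a reference
    // and `probe` remains ours to use afterwards.
    let found = set.contains(&probe);
    let missing = set.contains(&String::from("goodbye"));
    println!("{} {} {}", probe, found, missing);
    (found, missing)
}

fn main() {
    demo();
}
```

Trying to print word after the insert would fail to compile for the same reason the second om_nom_nom(s) call did.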

Borrowing and references: back to iterator invalidation

In class, I zoomed through this code example, which I argued illustrates the utility of limiting a value to one mutable reference. I’m going to work through this example a bit more here.

fn main() {
    let mut v = vec![1, 2, 3];
    /* This for loop borrows the vector above to do its work */    
    for i in &v { 
        println!("{}", i);
        v.push(34);
    }
}

First, I want to be clear that if you wrote equivalent code in C++, it would compile fine, but it would be unsafe. See the explanation of “iterator invalidation” related to the vector example described in lecture 3. The issue here is that we’re modifying a vector while trying to iterate through it. Since pushing new values to a data structure will often result in data being moved around in memory under the hood, it’s very likely that you’ll eventually access freed, arbitrary memory on the heap. This will likely crash your program; though, since we’re printing here, it could also leak private data.

Aside: When I (Thea) was a CA for CS144, I worked with a student who had an iterator invalidation bug: they called a function that conditionally (in an if statement) modified a data structure inside of a loop that iterated through that same data structure. They didn’t catch the bug during testing, because it happened to work fine when they ran it. When we ran their program through our autograder the first time, almost every single test failed with a segmentation fault. In other words, their program sometimes worked, with some inputs, on their machine – and happened to work almost all the time when they were testing – and sometimes accessed invalid memory, resulting in a segmentation fault. (In case you’re wondering, we emailed them to ask about it, and they found and fixed the bug!)

Ok. Back to Rust. When compiling the above code, we get the following error:

error[E0502]: cannot borrow `v` as mutable because it is also borrowed as immutable
 --> src/main.rs:6:9
  |
4 |     for i in &v { 
  |              --
  |              |
  |              immutable borrow occurs here
  |              immutable borrow later used here
5 |         println!("{}", i);
6 |         v.push(34);
  |         ^^^^^^^^^^ mutable borrow occurs here

To break this down a bit, I want to make the function calls that are happening here more explicit.

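for i in &v: the loop implicitly calls into_iter() on &v, so it holds an immutable borrow of v for as long as the loop is running.

v.push(34): push takes &mut self, so this call needs a mutable borrow of v.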

So, why do we get the above error?
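One way to see it is to write the loop out by hand. The following is a simplified sketch of roughly what the for loop does, not the compiler’s exact expansion; the function name run is made up, and the offending push is commented out so that the sketch compiles.

```rust
fn run() -> Vec<i32> {
    let mut v = vec![1, 2, 3];
    {
        // `for i in &v` roughly becomes: build an iterator from a shared
        // borrow of `v`, then call next() on it until it runs out.
        let mut iter = (&v).into_iter();
        while let Some(i) = iter.next() {
            println!("{}", i);
            // Vec::push(&mut v, 34);
            // ^ would need `&mut v` while `iter` still holds `&v`: error E0502
        }
    }
    // After the loop, the iterator (and its borrow) is gone, so this is fine.
    Vec::push(&mut v, 34);
    v
}

fn main() {
    println!("{:?}", run());
}
```

Written this way, the conflict is easier to spot: the iterator keeps an immutable reference to v alive across every iteration, and push asks for a mutable reference in the middle of that.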

Let’s play around with this code a bit. What if we take a mutable reference to v in the for loop?

fn main() {
    let mut v = vec![1, 2, 3];
    /* This for loop borrows the vector above to do its work */    
    for i in &mut v { 
        println!("{}", i);
        v.push(34);
    }
}

Again, we get an error:

error[E0499]: cannot borrow `v` as mutable more than once at a time
 --> src/main.rs:6:9
  |
4 |     for i in &mut v { 
  |              ------
  |              |
  |              first mutable borrow occurs here
  |              first borrow later used here
5 |         println!("{}", i);
6 |         v.push(34);
  |         ^^^^^^^^^^ second mutable borrow occurs here

for i in &mut v creates a mutable reference to v. While that’s in scope, we can’t create a second mutable reference to v – which is exactly what v.push tries to do.

What if we got rid of the & altogether?

fn main() {
    let mut v = vec![1, 2, 3];
    for i in v { 
        println!("{}", i);
        v.push(34);
    }
}

This gives us an entirely different error:

error[E0382]: borrow of moved value: `v`
   --> src/main.rs:5:9
    |
2   |     let mut v = vec![1, 2, 3];
    |         ----- move occurs because `v` has type `Vec<i32>`, which does not implement the `Copy` trait
3   |     for i in v { 
    |              -
    |              |
    |              `v` moved due to this implicit call to `.into_iter()`
    |              help: consider borrowing to avoid moving into the for loop: `&v`
4   |         println!("{}", i);
5   |         v.push(34);
    |         ^^^^^^^^^^ value borrowed here after move
    |
note: this function takes ownership of the receiver `self`, which moves `v`

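This is a transfer of ownership: for i in v implicitly calls v.into_iter(), which takes v by value. The vector is moved into the loop, so v can no longer be used inside the loop body.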

Again, what’s important in this example is not Rust syntax, which will become more clear as we move forward. What I want you to take away from this is that these (sometimes annoying) Rust rules can help prevent real-world issues that we observe in C/C++.

Conclusion: in class, someone asked, “how would we modify the code to make this work?” This is a great question, and the answer is we can’t, and that’s a good thing. Modifying a data structure when you’re iterating through it is a fundamentally incorrect thing to do, and the Rust compiler stops us from doing it for a reason.