Extra Ownership Practice
Will it compile?
Here are a few examples designed to refine your understanding of Rust’s ownership model. For each of these examples, think about:
- Will it compile?
- If not, why not? What could go wrong in an equivalent C or C++ program that does compile?
Ownership and mutability
Here’s our first example:
fn main() {
    let s = String::from("hello");
    s.push_str(" world");
}
This fails because s is immutable by default, and push_str would mutate the string.
In C and C++ (and most other languages), you need to explicitly designate variables as immutable using const. In C++, a const string cannot be mutated, and trying to mutate one is also a compiler error. In C, the const keyword is a real mess… const char* gets parsed as (const char)*, meaning you’re not allowed to modify the destination buffer, but you could reassign that variable to point to a different string. You can also write char* const, which does the opposite: you can’t reassign the variable to point to a different string, but you can modify the string buffer. If you want to get a truly immutable string, you have to use the type const char* const (or char const* const), which is just silly. Also, const was introduced later in C’s development, so const gets used very inconsistently throughout the standard library, and it’s not uncommon to need to insert dubious casts to get your code to compile.
To fix this, we need to use the mut keyword:
fn main() {
    let mut s = String::from("hello");
    s.push_str(" world");
}
Let’s take a look at passing variables to functions. Does this code compile?
fn om_nom_nom(s: String) {
    println!("{}", s);
}

fn main() {
    let s = String::from("hello");
    om_nom_nom(s);
}
This works! What if we add another om_nom_nom call?
fn om_nom_nom(param: String) {
    println!("{}", param);
}

fn main() {
    let s = String::from("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}
The compiler complains about ownership here. Let’s break this down:
- On the first line of main, s owns the string.
- On the next line, ownership gets transferred to the param parameter of om_nom_nom.
- When om_nom_nom returns, param goes out of scope, and ownership of the string hasn’t been transferred anywhere else, so the string is “dropped” and the string’s memory is freed.
- Back in main, on the third line, we try to use s again. However, we previously gave s away (and in fact s has already been destroyed). The compiler complains with an error explaining this:
error[E0382]: use of moved value: `s`
 --> src/main.rs:8:14
  |
6 |   let s = String::from("hello");
  |       - move occurs because `s` has type `std::string::String`, which does not implement the `Copy` trait
7 |   om_nom_nom(s);
  |              - value moved here
8 |   om_nom_nom(s);
  |              ^ value used here after move

error: aborting due to previous error
Important note for understanding: I think a lot of people look at this demo and think, oh my gosh, that’s annoying. Why does the compiler have to make things so complicated? However, I would argue that this only looks silly because it doesn’t have any mallocs or frees, and because the code is so short. Let’s look at how we might have done this in C (keeping in mind that String is a heap-allocated buffer).
We could have written the code like this:
#include <stdio.h>   // printf
#include <stdlib.h>  // free
#include <string.h>  // strdup

void om_nom_nom(char* s) {
    printf("%s\n", s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
    free(s);
}
Or like this:
void om_nom_nom(char* s) {
    printf("%s\n", s);
    free(s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}
Or like this:
void om_nom_nom(char* s) {
    printf("%s\n", s);
    free(s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
    free(s);
}
Or like this:
void om_nom_nom(char* s) {
    printf("%s\n", s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}
Of these four possibilities, only one works without memory errors. Keep in mind that this is a trivial example, and real systems code is far more complex. 100+ line functions aren’t rare, and it’s not uncommon to have memory that is allocated in one place and freed 9 hours and 2000k lines of code later. It’s extremely important to maintain some notion of ownership, i.e. some notion of who is responsible for cleaning up resources.
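For comparison, here is a minimal sketch of how the two-call version might be written in Rust so that it compiles. I'm assuming we change om_nom_nom to borrow the string (taking a &str) rather than take ownership of it; that signature change is my own illustration, not the only possible fix.

```rust
// Assumed variant of om_nom_nom: it borrows the string instead of owning it,
// so the caller keeps ownership and stays responsible for cleanup.
fn om_nom_nom(s: &str) {
    println!("{}", s);
}

fn main() {
    let s = String::from("hello");
    om_nom_nom(&s); // lend the string out
    om_nom_nom(&s); // still ours, so we can lend it again
} // s goes out of scope here, and its memory is freed exactly once
```

Notice that nobody had to decide where to put a free: the owner (s in main) cleans up when it goes out of scope, which corresponds to the one correct C version above.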
Exceptions to ownership
What if we pass a u32 (unsigned int) instead of a String? Is this always an issue?
fn om_nom_nom(param: u32) {
    println!("{}", param);
}

fn main() {
    let x = 1;
    om_nom_nom(x);
    om_nom_nom(x);
}
This actually works fine! As mentioned in last week’s lecture, the type u32 implements the Copy trait, which changes what happens when a value is assigned to another variable or passed as a parameter. We will talk more about traits in a couple of weeks, but for now, just know that if a type implements Copy, then its values are copied (rather than moved) on assignment and when passed as a parameter.
This is probably pretty confusing. How are you supposed to anticipate whether the compiler will copy a value when you pass it, or whether it will use ownership semantics? Unfortunately, you kind of just need to know about the types you’re using. The good news is that the vast majority of types aren’t tricky like this and use normal ownership semantics. Only primitive types and a handful of others use copy semantics.
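Here is a small sketch contrasting the two behaviors side by side. The helper names eat_number and eat_string are made up for this illustration; u32, String, and clone() are the real standard library pieces.

```rust
// Hypothetical helpers, named only for this example.
fn eat_number(n: u32) -> u32 {
    n + 1
}

fn eat_string(s: String) -> usize {
    s.len()
}

fn main() {
    let x: u32 = 1;
    let a = eat_number(x); // u32 is Copy: the function gets its own copy
    let b = eat_number(x); // so x is still usable here

    let s = String::from("hello");
    let len1 = eat_string(s.clone()); // String is not Copy: we ask for a copy explicitly
    let len2 = eat_string(s); // this call moves s; using s after this line would not compile

    println!("{} {} {} {}", a, b, len1, len2);
}
```

If you are unsure which behavior a type has, one option is to just try using the value after passing it; the compiler error will tell you if the type does not implement Copy.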
References
Let’s talk about borrowing. How does this code look to you?
fn main() {
    let s = String::from("hello");
    let s1 = &s;
    let s2 = &s;
    println!("{} {}", s, s1);
}
This code works fine because s, s1, and s2 are all immutable. Remember, you can have as many read-only pointers to something as you want, as long as no one can change what is being pointed to. (We want to avoid the scenario where chaos ensues because people are making sneak edits to the Google doc while others are trying to read it over.)
What if we bring mutable references into the mix?
fn main() {
    let s = String::from("hello");
    let s1 = &mut s;
    let s2 = &s;
    println!("{} {}", s, s1);
}
This fails to compile because s is immutable, and on the next line, we try to borrow a mutable reference to s. If this were allowed, we could modify the string using s1, even though it was supposed to be immutable.
Let’s fix that by declaring s as mutable:
fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    let s2 = &s;
    println!("{} {} {}", s, s1, s2);
}
This fails again, but for a different reason.
- We first declare s as mutable. 👍
- We borrow a mutable reference to s. 👍
- We try to borrow an immutable reference to s. However, there already exists a mutable reference to s. Rust doesn’t allow any other references to exist while a mutable reference is borrowed. Otherwise, the mutable reference could be used to change (potentially reallocate) memory when code using the other references least expects it.
Let’s remove the second borrow. Does this work?
fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    println!("{} {}", s, s1);
}
- We first declare s as mutable. 👍
- We borrow a mutable reference to s. 👍
- We try to use s. However, the value has been “borrowed out” to s1 and hasn’t been “returned” yet. As such, we can’t use s.
Here’s the compiler error:
error[E0502]: cannot borrow `s` as immutable because it is also borrowed as mutable
 --> src/main.rs:4:23
  |
3 |     let s1 = &mut s;
  |              ------ mutable borrow occurs here
4 |     println!("{} {}", s, s1);
  |                       ^  -- mutable borrow later used here
  |                       |
  |                       immutable borrow occurs here
The compiler is saying, “Hey, you borrowed s here, into s1. Now you’re trying to use s, but you haven’t gotten the value back yet. I can’t give you the value back yet, because s1 is still going to be used (as the second thing being printed in that println).”
How about this code?
fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    println!("{}", s1);
    println!("{}", s);
}
Unlike the previous example, this actually works. After the first println, Rust sees that s1 will not be used again, so it “returns” the borrowed value back to s. Then, when we try to use s, everything checks out. 👌
Here’s a question we got from a survey last year:
“One thing that’s confusing is why sometimes I need to &var and other times I can just use var: for example, set.contains(&var), but set.insert(var) – why?”
Can you answer this question based on your understanding of references now?
When inserting an item into a set, we want to transfer ownership of that item into the set; that way, the item will exist as long as the set exists. (It would be bad if you added a string to the set, and then someone freed the string while it was still a member of the set.) However, when trying to see if the set contains an item, we want to retain ownership, so we only pass a reference.
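To make that concrete, here is a short sketch using the standard library’s HashSet. The function name build_set and the strings are just placeholders I picked for the example.

```rust
use std::collections::HashSet;

// Hypothetical helper: builds a set that owns one string.
fn build_set() -> HashSet<String> {
    let mut set = HashSet::new();
    let word = String::from("hello");
    set.insert(word); // ownership of the string moves into the set
    // `word` can no longer be used here; the set owns the string now
    set
}

fn main() {
    let set = build_set();
    let query = String::from("hello");
    let found = set.contains(&query); // only lend query to the set for the lookup
    println!("{} {}", found, query); // query is still ours to use afterwards
}
```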
Borrowing and references: back to iterator invalidation
In class, I zoomed through this code example, which I argued illustrates the utility of limiting a value to one mutable reference. I’m going to work through this example a bit more here.
fn main() {
    let mut v = vec![1, 2, 3];
    /* This for loop borrows the vector above to do its work */
    for i in &v {
        println!("{}", i);
        v.push(34);
    }
}
First, I want to be clear that if you wrote equivalent code in C++, it would compile fine, but it would be unsafe. See the explanation of “iterator invalidation” related to the vector example described in lecture 3. The issue here is that we’re modifying a vector while trying to iterate through it. Since pushing new values to a data structure will often result in data being moved around in memory under the hood, it’s very likely that you’ll eventually access freed, arbitrary memory on the heap. This will likely crash your program; though, since we’re printing here, it could also leak private data.
Aside: When I (Thea) was a CA for CS144, I worked with a student who had an iterator invalidation bug: they called a function that conditionally (in an if statement) modified a data structure inside of a loop that iterated through that same data structure. They didn’t catch the bug during testing, because it happened to work fine when they ran it. When we ran their program through our autograder the first time, almost every single test failed with a segmentation fault. In other words, their program sometimes worked, with some inputs, on their machine – and happened to work almost all the time when they were testing – and sometimes accessed invalid memory, resulting in a segmentation fault. (In case you’re wondering, we emailed them to ask about it, and they found and fixed the bug!)
Ok. Back to Rust. When compiling the above code, we get the following error:
error[E0502]: cannot borrow `v` as mutable because it is also borrowed as immutable
 --> src/main.rs:6:9
  |
4 |     for i in &v {
  |              --
  |              |
  |              immutable borrow occurs here
  |              immutable borrow later used here
5 |         println!("{}", i);
6 |         v.push(34);
  |         ^^^^^^^^^^ mutable borrow occurs here
To break this down a bit, I want to make the function calls that are happening here more explicit.
for i in &v:
- There’s actually an implicit function call here. The function creates an “iterator”, an object that allows us to loop through the vector.
- We give this function an immutable reference to v (&v). We then get immutable references to items in v, one-by-one, stored in the variable i.
- What’s important here: for the scope of this for loop – until the closing } – there’s an immutable reference to v in existence.
v.push(34);
- This calls a function that pushes an element to the vector v.
- There’s an implicit parameter here: push takes in a mutable reference to v as its first parameter. In other words, v.push(34) could, conceptually, be written as push(&mut v, 34). (Don’t worry about this syntax yet! I just want to make explicit the reference requirement here.)
So, why do we get the above error?
- for i in &v initializes an immutable reference to v.
- v.push(34) (tries to) initialize a mutable reference to v.
- However, the original immutable reference to v is still in scope.
- We can’t have a mutable and immutable reference in existence at the same time.
- Bad times.
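If it helps, the two implicit calls can be spelled out by hand. This is only a sketch to show which references get created (the function name explicit_calls is made up, and the compiler’s actual desugaring of a for loop has a few more steps). Here the push happens after the loop, once the iterator and its immutable reference are gone, so it compiles.

```rust
// Hypothetical function name; the calls inside are real standard library calls.
fn explicit_calls() -> Vec<i32> {
    let mut v = vec![1, 2, 3];

    // `for i in &v` begins by turning the immutable reference into an iterator.
    let iter = (&v).into_iter();
    for i in iter {
        println!("{}", i);
    }
    // The iterator is gone now, so the immutable borrow of v has ended.

    // `v.push(34)` with the mutable reference written out explicitly.
    Vec::push(&mut v, 34);
    v
}

fn main() {
    println!("{:?}", explicit_calls());
}
```

Moving the Vec::push line inside the loop body brings back the same E0502 error, because the iterator’s immutable reference would still be alive at that point.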
Let’s play around with this code a bit. What if we take a mutable reference to v in the for loop?
fn main() {
    let mut v = vec![1, 2, 3];
    /* This for loop borrows the vector above to do its work */
    for i in &mut v {
        println!("{}", i);
        v.push(34);
    }
}
Again, we get an error:
error[E0499]: cannot borrow `v` as mutable more than once at a time
 --> src/main.rs:6:9
  |
4 |     for i in &mut v {
  |              ------
  |              |
  |              first mutable borrow occurs here
  |              first borrow later used here
5 |         println!("{}", i);
6 |         v.push(34);
  |         ^^^^^^^^^^ second mutable borrow occurs here
for i in &mut v creates a mutable reference to v. While that’s in scope, we can’t create a second mutable reference to v – which is exactly what v.push tries to do.
What if we got rid of the & altogether?
fn main() {
    let mut v = vec![1, 2, 3];
    for i in v {
        println!("{}", i);
        v.push(34);
    }
}
This gives us an entirely different error:
error[E0382]: borrow of moved value: `v`
 --> src/main.rs:5:9
  |
2 |     let mut v = vec![1, 2, 3];
  |         ----- move occurs because `v` has type `Vec<i32>`, which does not implement the `Copy` trait
3 |     for i in v {
  |              -
  |              |
  |              `v` moved due to this implicit call to `.into_iter()`
  |              help: consider borrowing to avoid moving into the for loop: `&v`
4 |         println!("{}", i);
5 |         v.push(34);
  |         ^^^^^^^^^^ value borrowed here after move
  |
note: this function takes ownership of the receiver `self`, which moves `v`
This is a transfer of ownership:
- Originally, v owns the vector object.
- Then, we call for i in v (note the lack of &). There’s an implicit call here to .into_iter(), which creates an iterator for the vector. By getting rid of &, we’ve transferred ownership of the vector to the iterator.
- When we call v.push(34), the compiler complains: the vector object can only have one owner, and that owner is no longer v. We can no longer use v to access the vector object.
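As a quick sanity check on the difference between for i in &v and for i in v, here is a sketch that loops over the same vector both ways without modifying it. The function name sum_twice is invented for the example.

```rust
// Hypothetical helper: sums the vector once by reference and once by value.
fn sum_twice(v: Vec<i32>) -> i32 {
    let mut total = 0;

    // Borrowing loop: v is only lent to the iterator, so it survives the loop.
    for i in &v {
        total += *i;
    }
    println!("after borrowing loop: {:?}", v);

    // Consuming loop: ownership of the vector moves into the iterator.
    for i in v {
        total += i;
    }
    // Using v from here on would be rejected with E0382 (use of a moved value).

    total
}

fn main() {
    println!("{}", sum_twice(vec![1, 2, 3]));
}
```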
Again, what’s important in this example is not Rust syntax, which will become more clear as we move forward. What I want you to take away from this is that these (sometimes annoying) Rust rules can help prevent real-world issues that we observe in C/C++.
Conclusion: in class, someone asked, “how would we modify the code to make this work?” This is a great question, and the answer is we can’t, and that’s a good thing. Modifying a data structure when you’re iterating through it is a fundamentally incorrect thing to do, and the Rust compiler stops us from doing it for a reason.