Sep 14, 03:15 PM
I recently saw a post on Mastodon by someone wrestling with learning Rust, who had conceptualized copies, moves, and clones thus:
- copy is a copy
- move is a shallow copy
- clone is a deep copy (but not always)
Rust can be confusing; it has some concepts that just don’t exist in other languages, and it considers certain semantics differently from other languages. I am in no way any kind of “Rust expert”; I’ve just written and read quite a bit of it (and quite a bit about it), and think that I understand this particular situation well enough to clear up this confusion.
Moves and Clones/Copies are conceptually different things; they sometimes happen in concert; they sometimes don’t. Let’s talk about them carefully and straighten this out.
Move
Right off the bat here, we have Rust’s ownership model (one of the aforementioned things that are “unusual” to Rust) cannonballing into our pool. Fundamental to ownership is the idea of a move. A move is just an agreement between you and the compiler (enforced, of course, by the compiler, because that’s what she does[1]) that you won’t use a variable binding beyond a certain point. This doesn’t necessarily involve any sort of copying.
    let a = String::from("foo");
    let b = a;
    // Can't use `a` anymore; have to call it `b`.
Here we “move” the value of `a` into `b` by assignment, but no copying needs to take place. We have just changed the name by which we refer to the string we created.
Sometimes this can involve copying.
    struct Foo {
        name: String,
        number: i32,
    }

    let a = String::from("foo");
    let foo = Foo {
        name: a, // Can't use `a` anymore; have to call it `foo.name`.
        number: 12,
    };
Here the underlying fat pointer of the `String` `a` (its pointer, length, and capacity) was most likely copied, because structs hold their members in a contiguous (save any alignment padding) block. The heap data it points to stays where it is.
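One way to check this for yourself, as a minimal sketch: `String::as_ptr` returns the address of the heap buffer, so comparing it before and after each move shows whether the text itself went anywhere. The helper name `heap_addr_survives_moves` is made up for this illustration.

```rust
struct Foo {
    name: String,
    number: i32,
}

/// Hypothetical helper: moves a `String` twice and reports whether the
/// heap buffer stayed at the same address the whole time.
fn heap_addr_survives_moves() -> bool {
    let a = String::from("foo");
    let before = a.as_ptr(); // address of the heap buffer holding "foo"

    let b = a; // move by assignment
    let after_assignment = b.as_ptr();

    let foo = Foo { name: b, number: 12 }; // move into a struct field
    let after_struct = foo.name.as_ptr();

    before == after_assignment && before == after_struct && foo.number == 12
}

fn main() {
    println!("{}", heap_addr_survives_moves());
}
```

Whatever bits do get shuffled around at the move sites, they are only the small handle that lives on the stack or inside the struct; the bytes of the string stay put.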
The point of a move is not to copy, but to ensure that a variable binding isn’t used once using it might cause trouble. Here’s an obvious example of where this might happen:
    let a = String::from("foo");
    let _ = std::thread::spawn(move || {
        function_that_mutates_a_string(a);
    });
    // Obviously, we shouldn't try to mess with `a` here anymore.
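Here is a runnable sketch of the same idea. The body of `function_that_mutates_a_string` is my own stand-in (the snippet above never defines it), and `mutate_on_another_thread` is likewise just a name for this example. The thread hands the string back through its join handle so we can see what happened to it.

```rust
use std::thread;

/// Hypothetical stand-in for the function named in the snippet above.
fn function_that_mutates_a_string(mut s: String) -> String {
    s.push_str("bar");
    s
}

/// Moves a `String` into a spawned thread and gets the result back
/// through the join handle.
fn mutate_on_another_thread() -> String {
    let a = String::from("foo");
    let handle = thread::spawn(move || function_that_mutates_a_string(a));
    // `a` has been moved into the closure; uncommenting the next line
    // makes the compiler reject the program because `a` was moved.
    // println!("{}", a);
    handle.join().expect("the spawned thread panicked")
}

fn main() {
    println!("{}", mutate_on_another_thread()); // prints "foobar"
}
```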
Clone
Now, the purpose of a clone is, in fact, to copy some bits. If you want to perform an operation that requires moving a value, but still want to be able to use that value afterward, you clone it. Whether this is a shallow or a deep copy depends on the type. For example, this is a deep copy:
    let a = String::from("foo");
    let b = a.clone();
    let _ = std::thread::spawn(move || {
        function_that_mutates_a_string(a);
    });
    // `a` is gone at this point, but we can still use `b`.
The call to `a.clone()` allocates a chunk on the heap, and copies the `'f'`, `'o'`, and `'o'` bytes into the new allocation. This is a deep copy, because `String`s are a “pointer type”, and we’ve not just copied the pointer value; we’ve copied what’s pointed to. But that’s not the only difference. Check out the following snippet in the Playground:
    fn main() {
        let mut a = String::from("frogs");
        a.push_str("!");
        println!("{}", a.capacity()); // prints "10"
        let b = a.clone();
        println!("{}", b.capacity()); // prints "6"
    }
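`a` and `b` also point to differently-sized chunks of memory, and accordingly the “capacity” values are different. (The exact numbers depend on how the standard library grows a `String`, so they aren’t guaranteed.)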
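The independence of the two strings can be checked directly. This is a small sketch (the function name `clone_then_mutate` is mine) that compares the heap addresses right after the clone and then mutates only the original:

```rust
/// Clones a `String`, mutates the original, and returns both values plus
/// a flag saying whether they shared a heap buffer right after the clone.
fn clone_then_mutate() -> (String, String, bool) {
    let mut a = String::from("frogs");
    let b = a.clone();
    // Compare the buffer addresses while both strings are alive.
    let shares_buffer = a.as_ptr() == b.as_ptr();
    a.push_str("!"); // only touches `a`'s buffer
    (a, b, shares_buffer)
}

fn main() {
    let (a, b, shares_buffer) = clone_then_mutate();
    println!("{} {} {}", a, b, shares_buffer); // prints "frogs! frogs false"
}
```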
In general, the only types that yield “shallow” copies are explicitly reference-counted types.
    use std::sync::Arc;

    let a = Arc::new(String::from("foo"));
    let b = a.clone();
There is only one string in the above example, but you can read it through both `a` and `b`. Because of Rust’s explicit ownership system, unless we have a specific need for reference counting, we just pass references around, sparing us the overhead of counting them, confident that the borrow checker and ownership system won’t let us screw it up.
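As a quick sketch of the difference, the standard library lets us ask an `Arc` whether two handles point at the same allocation (`Arc::ptr_eq`) and how many handles exist (`Arc::strong_count`). The function names `shared_string` and `shout` are made up for this example; `shout` shows the more common alternative of just borrowing.

```rust
use std::sync::Arc;

/// Clones an `Arc<String>` and reports whether both handles point at the
/// same allocation, along with the reference count.
fn shared_string() -> (bool, usize) {
    let a = Arc::new(String::from("foo"));
    let b = Arc::clone(&a); // equivalent to `a.clone()`
    (Arc::ptr_eq(&a, &b), Arc::strong_count(&a))
}

/// Borrowing: no clone and no reference count, just a reference.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    println!("{:?}", shared_string()); // prints "(true, 2)"
    let owned = String::from("foo");
    println!("{}", shout(&owned)); // prints "FOO"
    println!("{}", owned); // `owned` was only borrowed, so it is still usable
}
```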
Here’s a slightly more complicated example of a deep copy (on Playground):
    #[derive(Clone, Debug)]
    struct Point {
        x: f64,
        y: f64,
    }

    #[derive(Clone, Debug)]
    struct Line {
        start: Point,
        end: Point,
    }

    fn main() {
        let mut a = Line {
            start: Point { x: 0.0, y: 0.0 },
            end: Point { x: 4.0, y: 3.0 },
        };
        let b = a.clone();
        a.start.x = 1.0;
        println!("{:?}", &a); // Line { start: Point { x: 1.0, ...
        println!("{:?}", &b); // Line { start: Point { x: 0.0, ...
    }
You can see that this is a deep copy, as mutating `a.start.x` has no effect on the value of `b.start.x`.
Now, because our `Line` type doesn’t contain any complicated references or heap allocations, at the moment of our clone (before we mutate `a`), unlike with the `String` example above, `b` is an exact bitwise copy of `a`; we can clone a `Line` merely by copying its representation in memory byte-for-byte to a different location. This brings us to…
Copy
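The `Copy` trait is just automatic `Clone`ing behavior for types that can be cloned with just a bitwise copy. If your struct or enum can be safely copied this way (every field is itself `Copy`, and the type doesn’t implement `Drop`), you can `#[derive(Clone, Copy)]` on it, and any time one of its values gets moved out of a binding, the compiler will (if you use that same binding again later) behave as if it had inserted a `.clone()` at the site of the move for you. Strictly speaking it just copies the bits rather than calling your `clone` method, but for these types the result is the same.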
Here is the previous snippet again, but with `Copy` derived for our types (on Playground):
    #[derive(Clone, Copy, Debug)]
    struct Point {
        x: f64,
        y: f64,
    }

    #[derive(Clone, Copy, Debug)]
    struct Line {
        start: Point,
        end: Point,
    }

    fn main() {
        let mut a = Line {
            start: Point { x: 0.0, y: 0.0 },
            end: Point { x: 4.0, y: 3.0 },
        };
        let b = a; // No call to `.clone()`; if `Line` weren't Copy,
                   // it'd be moved.
        a.start.x = 1.0; // But we can still use `a` here, because the
                         // compiler cloned it automatically.
        println!("{:?}", &a);
        println!("{:?}", &b);
    }
This example has identical behavior to the previous one with the explicit `.clone()`. Deriving `Copy` on a type is essentially you telling the compiler, “Okay, any time you feel like coming back with a ‘use of moved value’ error about one of these, just go ahead and insert the necessary clone yourself.”
And again, this is a deep copy; as I mentioned before, most actual copies in Rust are deep.
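To get a feel for which types qualify, here is a minimal sketch. The helper `assert_copy` is a made-up function that only compiles when its type parameter implements `Copy`, and `copy_then_mutate` is likewise just a name for this example. The commented-out lines mark the things the compiler refuses.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// A struct that owns a `String` cannot be `Copy`; uncommenting the two
// lines below fails to compile, because `String` is not `Copy`.
// #[derive(Clone, Copy)]
// struct Named { name: String }

/// Hypothetical helper: compiles only when `T` implements `Copy`.
fn assert_copy<T: Copy>() {}

/// Assigns a `Point` to a second binding, then mutates the first.
fn copy_then_mutate() -> (Point, Point) {
    let mut a = Point { x: 0.0, y: 0.0 };
    let b = a; // copied, not moved, because `Point` is `Copy`
    a.x = 1.0; // `a` is still usable, and `b` is unaffected
    (a, b)
}

fn main() {
    assert_copy::<Point>();
    assert_copy::<i32>();
    // assert_copy::<String>(); // does not compile: `String` is not `Copy`
    let (a, b) = copy_then_mutate();
    println!("{:?} {:?}", a, b);
}
```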
To Summarize
- Sometimes bits get copied at a move site, but the move itself is really just how we conceptualize one method the compiler uses to prevent us from aliasing values in a potentially problematic way.
- When writing Rust, we tend to think about explicit reference passing or explicit cloning instead of shallow or deep copying. Almost all explicit clones are deep; when they’re not, we definitely know, because we’ve had to specifically go out of our way to get this behavior (by wrapping our value in something like an `Arc` or an `Rc`).
- For types that support it, `Copy` is (from the perspective of program behavior) identical to `Clone`. The only difference is that, for types whose values can be safely copied bit-for-bit, you can explicitly tell the compiler (through `#[derive(Copy)]`) that it’s okay for the compiler to implicitly clone them if it feels like it needs to.
I hope this has been clear and helpful. If anyone has any further questions, or if an actual Rust expert happens to notice that I’ve gotten something wrong here, you can find me at @d2718@hachyderm.io.
[1] What with the strict type system and the borrow checker, it’s easy to think of the Rust compiler’s chief duty as being enforcement rather than, say, code generation.