Rvalue references and move semantics can hardly be counted among the beginner’s topics in C++. After encountering them for the first time in a Scott Meyers book, I felt deeply confused. What would you ever need them for? And why make an already complex language even more so? It turns out that rvalue references are not only useful, they also simplify common patterns!
Dynamic resource management, the old way
Take a look at the following rudimentary String implementation:
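#include <cstddef>
#include <cstring>
#include <ostream>

class String {
private:
    std::size_t length_ {0};
    char *data_ { new char[1] { '\0' }};
public:
    // empty string; relies on the default member initializers above
    String() = default;
    // builds a String from a null-terminated C string
    String(const char * str):
        length_{ std::strlen(str) },
        data_{ new char[length_ + 1] }
    {
        std::memcpy(data_, str, length_ + 1);
    }
    // copy constructor: allocates a fresh buffer and copies the characters
    String(const String& other):
        length_{ other.length_ },
        data_{ new char[ length_ + 1]}
    {
        std::memcpy(data_, other.data_, length_ + 1);
    }
    // copy assignment: reallocates only when the lengths differ
    String& operator=(const String& other) {
        if (this == &other) {
            return *this; // self-assignment, nothing to do
        }
        if (length_ != other.length_) {
            auto data = new char[other.length_ + 1];
            delete[] data_;
            data_ = data;
            length_ = other.length_;
        }
        std::memcpy(data_, other.data_, length_ + 1);
        return *this;
    }
    ~String() noexcept {
        delete[] data_;
    }
    friend inline std::ostream& operator<<(std::ostream& out, const String& str) {
        return out << str.data_;
    }
};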
String manages a dynamic memory resource (the char array data_). To do so, it implements the RAII idiom and, adhering to the rule of three, defines three special member functions:
- copy constructor,
- copy assignment operator,
- destructor.
Other classes can use String to store text data without worrying about memory management:
class File {
private:
    String name_;
public:
    File(const String& name):
        name_{ name }
    {}
    friend inline std::ostream& operator<<(std::ostream& out, const File& file){
        return out << "File{ name='" << file.name_ << "' }";
    }
};
The snippet of code below demonstrates how String and File are used together. When executed, it prints data.in, followed by File{ name='data.in' }.
#include <iostream>
int main() {
    String name{ "data.in" };
    File file{ name };
    std::cout << name << '\n';
    std::cout << file;
}
Thanks to the RAII guarantees, the heap memory claimed by name and by file’s name_ member is automatically released when those objects go out of scope. You can verify that by adding an output statement to String’s destructor:
String::~String() noexcept {
    std::cout << "Destroying String with content: '" << data_ << "'\n";
    delete[] data_;
}
And yet, can you spot a disturbing issue with the current implementation of String and File? Yes, it is…
Copies, copies everywhere
The String object name, which contains data.in, is used to create a File instance. That is its only real purpose; there is no other reason for its existence. It’s created and then passed to File’s constructor, where a copy of it is made when initializing File::name_. That makes two String objects springing to life in this short program. One of them is an unnecessary copy and you should get rid of it. On the first try, you could skip creating name altogether by writing:
File file{ "data.in" };
This works because of the implicit conversion from a string literal to String supported by one of String’s constructors. But there is still an extra object created and a copy being made. The additional String object appears as an argument to File’s constructor. It only lives for the duration of the constructor call, bound to its const reference parameter, but it’s there and it’s being copied. That’s just silly: why would you need two Strings when one suffices?
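One way to make the extra copy visible is to count copy-constructor calls. The sketch below uses made-up stand-ins, CountingString and CountingFile, rather than the String and File classes above; the copies counter and the copies_made_for_literal function are assumptions added purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for String that counts how often it is copied.
class CountingString {
    std::size_t length_{0};
    char* data_{ new char[1]{'\0'} };
public:
    static inline int copies = 0;   // bumped by the copy constructor

    CountingString(const char* str)
        : length_{ std::strlen(str) }, data_{ new char[length_ + 1] } {
        std::memcpy(data_, str, length_ + 1);
    }
    CountingString(const CountingString& other)
        : length_{ other.length_ }, data_{ new char[length_ + 1] } {
        std::memcpy(data_, other.data_, length_ + 1);
        ++copies;
    }
    CountingString& operator=(const CountingString&) = delete; // not needed here
    ~CountingString() { delete[] data_; }
};

// Hypothetical stand-in for File: stores its name by copying it.
class CountingFile {
    CountingString name_;
public:
    CountingFile(const CountingString& name) : name_{ name } {}
};

// Builds a CountingFile from a string literal and reports the copies made.
int copies_made_for_literal() {
    CountingString::copies = 0;
    CountingFile file{ "data.in" };       // temporary CountingString, then a copy
    assert(CountingString::copies == 1);  // the copy made while initializing name_
    return CountingString::copies;
}
```

Calling copies_made_for_literal() should report a single copy: the temporary is built straight from the literal, and the copy happens when name_ is initialized from it.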
Stealing an object’s content
There is a solution to this craziness. The main burden of String is the ownership of a piece of heap memory. That’s a resource that can be “stolen” by another String. Stealing is done by hijacking the ownership of the heap-allocated memory:
class String {
public:
    void steal_content(String& other) {
        // delete currently held own memory
        delete[] data_;
        // actual "stealing"
        length_ = other.length_;
        data_ = other.data_;
        // making sure other object is still valid
        // and that it doesn't refer to its old data
        other.length_ = 0;
        other.data_ = new char[1]{};
    }
};
In steal_content, after the memory is stolen, the just-robbed other object is set to the valid state of an empty string. You don’t want two Strings pointing to the same heap memory.
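To see the effect of stealing on both objects, here is a minimal sketch. TinyString is a cut-down stand-in for String, and the c_str() and length() accessors, the self-steal guard, and the steal_leaves_source_empty function are assumptions added only so the state can be inspected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Minimal stand-in for String with hypothetical accessors for inspection.
class TinyString {
    std::size_t length_{0};
    char* data_{ new char[1]{'\0'} };
public:
    TinyString() = default;
    TinyString(const char* str)
        : length_{ std::strlen(str) }, data_{ new char[length_ + 1] } {
        std::memcpy(data_, str, length_ + 1);
    }
    TinyString(const TinyString&) = delete;             // copying not needed here
    TinyString& operator=(const TinyString&) = delete;
    ~TinyString() { delete[] data_; }

    void steal_content(TinyString& other) {
        if (this == &other) return;   // assumption: guard against self-steal
        delete[] data_;               // release what we currently own
        length_ = other.length_;      // take over the other buffer
        data_ = other.data_;
        other.length_ = 0;            // leave other as a valid empty string
        other.data_ = new char[1]{};
    }

    const char* c_str() const { return data_; }
    std::size_t length() const { return length_; }
};

// Returns true if, after stealing, target holds the text and source is empty.
bool steal_leaves_source_empty() {
    TinyString source{ "data.in" };
    TinyString target;
    const char* buffer_before = source.c_str();
    target.steal_content(source);
    assert(target.c_str() == buffer_before);   // same buffer, no characters copied
    return std::strcmp(target.c_str(), "data.in") == 0
        && source.length() == 0
        && std::strcmp(source.c_str(), "") == 0;
}
```

The pointer comparison is the interesting part: the target ends up owning the very buffer the source allocated, so no characters are copied.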
Naturally, stealing is not really an official name for what happens in steal_content; it’s usually called moving. Ignoring the naming details, you can use this function in File to avoid making a copy of the name argument:
class File {
private:
    String name_;
public:
    File(String& name):
        name_{ } // default initialize
    {
        name_.steal_content(name); // move the name object
    }
};
But this has a nasty side effect. Because the constructor’s parameter became a non-const lvalue reference (otherwise the moved-from object could not be modified), it cannot be used as before:
File file{ "data.in" };
Among lvalue references, only a const one can bind to the temporary String object that’s implicitly created from "data.in". Naturally, you can provide two constructor overloads, one taking a const String& and one taking a String&. It could work in this case, but such an approach does not scale to other scenarios. What if there is already a const/non-const overload set in the codebase? Perhaps a silly one, like in the example below:
class File {
private:
    String name_;
public:
    // throws on invalid file names
    File(const String& name):
        name_{ throw_if_invalid(name) }
    {}
    // tries to fix the file name if it's invalid
    File(String& name):
        name_{ try_repair(name) }
    {}
};
A common solution to exploding overload sets is using a tag. A tag is a special, empty structure that’s added to the parameter list of a function to distinguish between overloads. Something like this suffices:
struct MovableTag{};
You could add support for MovableTag to both String and File, leading to:
class String {
public:
    // copy constructor
    String(const String& other);
    // move constructor with a tag
    String(String& other, MovableTag):
        length_{ other.length_ },
        data_{ other.data_ }
    {
        other.length_ = 0;
        other.data_ = new char[1]{};
    }
};
class File {
public:
    // throws on invalid file names
    File(const String& name):
        name_{ throw_if_invalid(name) }
    {}
    // tries to fix the file name if it's invalid
    File(String& name):
        name_{ try_repair(name) }
    {}
    // just moves name, tagged overload
    File(String& name, MovableTag):
        name_{ name, MovableTag{} }
    {}
};
As a point of interest, a parameter of type MovableTag is never named in function definitions. This has two advantages. First, there’s no compiler warning about an unused parameter. Second, it’s a clear signal to the reader that the parameter is only used for overload resolution; since the tag is an empty type, compilers can typically optimize it away.
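Tags are not a home-grown trick: the standard library uses the same technique. As a small illustration, std::optional has a constructor selected by the empty tag std::in_place, which builds the contained value directly from the remaining arguments. The function name build_in_place is made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

// Uses the standard tag std::in_place to pick the std::optional constructor
// that forwards its remaining arguments to the contained type.
std::string build_in_place() {
    // Forwards (3, 'x') to std::string's (count, character) constructor.
    std::optional<std::string> text(std::in_place, std::size_t{3}, 'x');
    assert(text.has_value());
    return *text;
}
```

Just like MovableTag, std::in_place carries no data; it exists only to steer overload resolution.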
With both File and String taking advantage of the tag, the code becomes:
#include <iostream>
int main() {
    String name{ "data.in" };
    File file{ name, MovableTag{} };
    std::cout << name << '\n';
    std::cout << file;
}
It’s a bit uglier, but it works just fine, printing an empty line (the moved-from name object), followed by File{ name='data.in' }. Sadly, on top of the subjective ugliness, it is impractical to implement a move assignment operator using tagging. In other words, making something along these lines work:
class String {
public:
    // does not compile: operator= must take exactly one parameter
    String& operator=(String& other, MovableTag) {
        if (this != &other) {
            delete[] data_;
            data_ = other.data_;
            length_ = other.length_;
            other.length_ = 0;
            other.data_ = new char[1]{};
        }
        return *this;
    }
};
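is not a trivial task, because an overloaded operator= must take exactly one parameter, which leaves no room for the tag. To make the code more readable and to allow move assignment, you’ll need: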
Stealing by wrapping
The idea is straightforward: instead of using a tag object, you use a wrapper to distinguish between the overloads. The wrapper must be generic to support arbitrary types:
template <typename T>
struct Movable {
    T& obj;
};
// deduction guide
template <typename T>
Movable(T) -> Movable<T>;
The wrapper has only one data member, a non-const lvalue reference to an object that can be moved from. Because Movable is an aggregate, a deduction guide is added (it is needed before C++20) to enable writing code like:
String name{ "data.in" };
Movable movable{ name }; // deduction guide used to deduce T=String
You could even go a step further and supplement Movable with a function template:
template <typename T>
Movable<T> as_movable(T& t) {
    return {t};
}
/* ~~~ */
String name{ "data.in" };
auto movable = as_movable( name ); // function template argument deduction gives T=String
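Both ways of creating the wrapper can be checked at compile time. The sketch below repeats Movable and as_movable so that it is self-contained, and uses std::string as a stand-in for String; the function wrapper_refers_to_original is a made-up name for this illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

template <typename T>
struct Movable {
    T& obj;
};

// deduction guide, as in the text
template <typename T>
Movable(T) -> Movable<T>;

template <typename T>
Movable<T> as_movable(T& t) {
    return { t };
}

// Returns true if the wrapper refers to the original object (no copy made).
bool wrapper_refers_to_original() {
    std::string name{ "data.in" };     // std::string stands in for String

    Movable direct{ name };            // class template argument deduction
    auto wrapped = as_movable(name);   // function template argument deduction

    // Both routes end up with the same wrapper type.
    static_assert(std::is_same_v<decltype(direct), Movable<std::string>>);
    static_assert(std::is_same_v<decltype(wrapped), Movable<std::string>>);

    assert(&direct.obj == &name);
    return &wrapped.obj == &name;
}
```

The address comparisons confirm that wrapping is cheap: the wrapper only holds a reference to name, and nothing is copied or moved yet.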
With Movable in place, you can refactor String to support move construction and move assignment with:
class String {
private:
    std::size_t length_ {0};
    char *data_ { new char[1] { '\0' }};
public:
    // copy constructor and assignment op.
    String(const String& other);
    String& operator=(const String& other);
    // move constructor
    String(Movable<String>& movable):
        length_{ movable.obj.length_ },
        data_{ movable.obj.data_ }
    {
        movable.obj.length_ = 0;
        movable.obj.data_ = new char[1]{};
    }
    // move assignment operator
    String& operator=(Movable<String>& movable) {
        if (this != &movable.obj) {
            delete[] data_;
            data_ = movable.obj.data_;
            length_ = movable.obj.length_;
            movable.obj.length_ = 0;
            movable.obj.data_ = new char[1]{};
        }
        return *this;
    }
    // destructor
    ~String() noexcept;
};
In this implementation, Movable<T> is used to distinguish between the overloaded constructors and assignment operators. The move variants still steal the data from the String passed to them as a wrapped argument. Movable enabled writing the move assignment operator, something that was impractical with a tag.
You should also change File to support the new String constructor:
class File {
private:
    String name_;
public:
    // copies String
    File(const String& name):
        name_{ name }
    {}
    // moves String
    File(Movable<String>& name):
        name_{ name }
    {}
};
And finally, you’ll enjoy clean code with both copying and moving variants available:
#include <iostream>
int main() {
    String input_name{ "data.in" };
    String output_name{ "results.out" };
    // moves input_name
    auto movable{ as_movable(input_name) };
    File input{ movable };
    // copies output_name
    File output{ output_name };
}
With a small caveat… Movable<String> is taken by non-const reference by the constructors. Consequently, it’s impossible to move an object that is wrapped in place:
#include <iostream>
int main() {
    String input_name{ "data.in" };
    String output_name{ "results.out" };
    // moves input_name, OK
    auto movable{ as_movable(input_name) };
    File input{ movable };
    // moves output_name, ERROR
    File output{ as_movable(output_name) };
}
The snippet of code above produces an error saying that you cannot bind a non-const lvalue reference to a temporary object created with as_movable. You could, potentially, solve this problem by changing the signatures of the constructors to:
String::String( const Movable<String>& movable);
File::File( const Movable<String>& movable);
And, shockingly, it would work! This trick exploits one of the dark corners of C++. The obj data member of Movable must obey Movable’s constness, but obj is a reference, and a reference cannot be re-seated anyway. The referent of obj (the String object that obj refers to) does not belong to Movable and is not affected by whatever happens to it. Consequently, despite Movable being passed by const reference, you can modify the String referenced by obj and move the data that belongs to it.
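Both halves of this argument can be demonstrated in a few lines. The sketch below repeats Movable so that it is self-contained and uses std::string as a stand-in for String; clear_through, the ByConstRef and ByRef aliases, and referent_modified_via_const_wrapper are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

template <typename T>
struct Movable {
    T& obj;
};

// Takes the wrapper by const reference, yet empties the wrapped string.
void clear_through(const Movable<std::string>& movable) {
    // The member is still a plain std::string&; const did not reach the referent.
    static_assert(std::is_same_v<decltype(movable.obj), std::string&>);
    movable.obj.clear();
}

using ByConstRef = void (*)(const Movable<std::string>&);
using ByRef      = void (*)(Movable<std::string>&);

// A temporary wrapper can be passed by const reference...
static_assert(std::is_invocable_v<ByConstRef, Movable<std::string>>);
// ...but not by non-const lvalue reference.
static_assert(!std::is_invocable_v<ByRef, Movable<std::string>>);

// Returns true if name was modified through a const temporary wrapper.
bool referent_modified_via_const_wrapper() {
    std::string name{ "data.in" };
    assert(!name.empty());
    clear_through(Movable<std::string>{ name });   // temporary wrapper
    return name.empty();
}
```

The static assertions capture the binding rules, while the function shows that a const wrapper does not protect the object it refers to.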
The code works as intended and expresses the programmer’s intent in a clean way (as_movable). Yet it feels like a fraud: an object that can be moved from is passed (indirectly) by const reference. Writing code like this is asking for trouble for your future self. There must be a better way, and it came in C++11 with…
Rvalue references
To understand how rvalue references work, it’s enough to replace each Movable<String>& with String&& (skipping the const qualifier) and each as_movable with std::move. After making those changes and the necessary adjustments, String becomes:
class String {
private:
    std::size_t length_ {0};
    char *data_ { new char[1] { '\0' }};
public:
    // copy constructor and assignment op.
    String(const String& other);
    String& operator=(const String& other);
    // move constructor
    String(String&& movable):
        length_{ movable.length_ },
        data_{ movable.data_ }
    {
        movable.length_ = 0;
        movable.data_ = new char[1]{};
    }
    // move assignment operator
    String& operator=(String&& movable) {
        if (this != &movable) {
            delete[] data_;
            data_ = movable.data_;
            length_ = movable.length_;
            movable.length_ = 0;
            movable.data_ = new char[1]{};
        }
        return *this;
    }
    // destructor
    ~String() noexcept;
};
And the File class is:
class File {
private:
    String name_;
public:
    // copies String
    File(const String& name):
        name_{ name }
    {}
    // moves String
    File(String&& name):
        name_{ std::move(name) } // <- notice std::move
    {}
};
Notice a small addition: there’s now a std::move when initializing name_ in File’s second constructor. That’s because of the language rule that rvalue-ness doesn’t propagate. A named rvalue reference parameter (name) is itself an lvalue inside the function, so passing it further down the line as is would select the copy constructor. It needs to be turned into an rvalue again, and that’s the job of std::move. Yes, you read that correctly, std::move doesn’t move anything. It just casts its argument to an rvalue reference, similarly to how as_movable marked something as an object potentially movable from by wrapping it into Movable.
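Both points, that a named rvalue reference behaves as an lvalue and that std::move is only a cast, can be checked with a pair of overloads. In this sketch std::string stands in for String, and the function names category, forward_without_move, forward_with_move, and move_alone_changes_nothing are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Two overloads that report which kind of argument they received.
const char* category(const std::string&) { return "lvalue"; }
const char* category(std::string&&)      { return "rvalue"; }

// Inside these functions the named parameter text is an lvalue expression,
// even though its declared type is std::string&&.
const char* forward_without_move(std::string&& text) { return category(text); }
const char* forward_with_move(std::string&& text)    { return category(std::move(text)); }

// std::move is only a cast: its result type is an rvalue reference.
static_assert(std::is_same_v<decltype(std::move(std::declval<std::string&>())),
                             std::string&&>);

// Returns true if name still holds its text after std::move alone.
bool move_alone_changes_nothing() {
    std::string name{ "data.in" };
    std::string&& ref = std::move(name);   // a cast; nothing is moved yet
    assert(&ref == &name);                 // ref is just another name for name
    return name == "data.in";              // name is still intact
}
```

An actual move only happens once a move constructor or move assignment operator receives the rvalue; the cast on its own leaves the object untouched.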
With those changes, the main part of the code becomes:
int main() {
    String input_name{ "data.in" };
    String output_name{ "results.out" };
    // copies input_name
    File input{ input_name };
    // moves output_name
    File output{ std::move(output_name) };
}
And voilà! The program works as intended.
Can you spot the difference between using rvalue references and the home-cooked solution? You can mark an object as movable with std::move in place when passing it to a function. With as_movable that was impossible because a non-const lvalue reference cannot bind to a temporary object. With rvalue references the issue isn’t there: they are made specifically to bind both to objects marked with std::move and to temporaries. So even the code below, where an rvalue reference to String binds directly to a temporary returned by a function, works:
String get_name();
int main() {
    File file{ get_name() };
}
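To check which constructor runs in each of the three situations (a named object, an object marked with std::move, and a temporary), the classes can be instrumented with counters. The following is a condensed sketch, not the full String from above: the copies and moves counters, the length() accessor, the Counts struct, and the count, from_lvalue, from_moved, and from_temporary helpers are all assumptions added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Condensed String with hypothetical counters for copies and moves.
class String {
    std::size_t length_{0};
    char* data_{ new char[1]{'\0'} };
public:
    static inline int copies = 0;
    static inline int moves = 0;

    String(const char* str)
        : length_{ std::strlen(str) }, data_{ new char[length_ + 1] } {
        std::memcpy(data_, str, length_ + 1);
    }
    String(const String& other)
        : length_{ other.length_ }, data_{ new char[length_ + 1] } {
        std::memcpy(data_, other.data_, length_ + 1);
        ++copies;
    }
    String(String&& other)
        : length_{ other.length_ }, data_{ other.data_ } {
        other.length_ = 0;
        other.data_ = new char[1]{};
        ++moves;
    }
    String& operator=(const String&) = delete;   // omitted for brevity
    ~String() { delete[] data_; }
    std::size_t length() const { return length_; }   // hypothetical accessor
};

class File {
    String name_;
public:
    File(const String& name) : name_{ name } {}        // copies
    File(String&& name) : name_{ std::move(name) } {}  // moves
};

String get_name() { return String{ "data.in" }; }

struct Counts { int copies; int moves; };

// Resets the counters, runs make_file, and reports what happened.
template <typename MakeFile>
Counts count(MakeFile make_file) {
    String::copies = 0;
    String::moves = 0;
    make_file();
    return { String::copies, String::moves };
}

Counts from_lvalue() {
    String n{ "data.in" };
    return count([&] { File f{ n }; });
}

Counts from_moved() {
    String n{ "data.in" };
    Counts c = count([&] { File f{ std::move(n) }; });
    assert(n.length() == 0);   // n was left as an empty string
    return c;
}

Counts from_temporary() {
    return count([] { File f{ get_name() }; });
}
```

With a C++17 compiler, from_lvalue() should report one copy and no moves, while from_moved() and from_temporary() should each report one move and no copies.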
Summary
What have you learned? Hopefully quite a handful:
- Moving is actually “stealing” dynamic resources owned by an object and placing them into another one.
- Moving is possible without rvalue references.
- Tagging and wrapping help with overload resolution.
- Objects that are tagged as movable can be moved-from.
- In modern C++, an rvalue reference is used to “tag” an object as movable.
- std::move casts an object to an rvalue reference.
- std::move doesn’t move anything; it’s actually closer in meaning to as_movable.