Authors: Tan Jun An
C++ is a general-purpose programming language designed to provide high performance and efficiency for both resource-constrained and large systems. Since its creation, the language has been extended and improved, with new standards released periodically. One such standard, C++11, introduced features such as Rvalue References and Move Semantics, which allow programs to avoid unnecessary copies.
To better understand the benefits of Move Semantics, it is important to first understand the two other modes already available in C++: Value Semantics and Reference Semantics.
Value (or copy) semantics is the programming style where users are only concerned with the values stored in objects, rather than with the objects themselves. As such, a copy of the object is created whenever it is passed to a function (also known as pass-by-value), or when it is used to construct or assign to another object. This ensures that each object (or function) has its own copy of the value to use, without having to be concerned with the originator. By default, C++ uses this mode when a variable is declared with only the data type.
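A minimal sketch of the behaviour described above. The `Point` struct and the `shifted` function are hypothetical names introduced only for this illustration; the point is that every copy is independent of its originator.

```cpp
#include <cassert>

// Hypothetical type, used only for illustration.
struct Point {
    int x;
    int y;
};

// The parameter is passed by value, so `p` is a private copy of the argument.
Point shifted(Point p, int dx) {
    p.x += dx;  // modifies the copy only
    return p;
}

void value_semantics_demo() {
    Point a{1, 2};
    Point b = a;               // copy construction: b is a separate object
    b.y = 20;                  // does not affect a
    Point c = shifted(a, 10);  // a is copied into the parameter

    assert(a.x == 1 && a.y == 2);  // the originator is untouched
    assert(b.y == 20);
    assert(c.x == 11 && c.y == 2);
}
```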
Some advantages of Value Semantics include:
However, Value Semantics has one major flaw:
Reference (or pointer) semantics is another choice available in C++. It allows users to declare pointers and references that refer to the address of an existing object. We can then pass these references or addresses to functions, or use them to initialize other pointers and references, and all of them will refer to and use the same object at the same address.
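As a small sketch of this sharing (the `increment` function and the variable names are hypothetical, chosen for illustration), a reference, a pointer and a by-reference parameter below all operate on one object:

```cpp
#include <cassert>

// Takes a reference: `counter` is another name for the caller's object.
void increment(int& counter) {
    ++counter;
}

// Returns the final value of the single shared object.
int reference_semantics_demo() {
    int value = 0;
    int& alias = value;     // reference to the same object
    int* pointer = &value;  // pointer holding the object's address

    increment(value);  // value is now 1
    alias += 1;        // value is now 2
    *pointer += 1;     // value is now 3

    assert(&alias == &value && pointer == &value);  // one object, one address
    return value;
}
```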
Some advantages of Reference Semantics include:
However, Reference Semantics also comes with its own problems, such as:
From the summary above, there are many benefits to using Value Semantics. However, its one major flaw is that a copy of the object is always created, which can be computationally expensive if the object is large. Reference Semantics may also not be preferred because of the problems discussed above. To keep the benefits Value Semantics has over Reference Semantics while overcoming its major flaw, C++11 introduced a new mode: Move Semantics.
To understand how Move Semantics works in C++, it is important to distinguish between an rvalue and an lvalue.
lvalue = rvalue
In the line above, an lvalue (left value) is a value that is addressable, that is, it refers to an object with an identifiable location in memory. An rvalue (right value) is a temporary object or value that typically appears only on the right side of an assignment expression. More details on the classification of these two kinds of values can be found here.
Rvalue References allow us to distinguish an lvalue from an rvalue. In C++11, we can declare an Rvalue Reference by writing && after the type:
int &&rvalue = 55;
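We can also convert an lvalue to an rvalue using the std::move function (declared in the <utility> header). Note that std::move only performs a cast; it does not move anything by itself: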
int lvalue = 99;
int &&rvalue2 = std::move(lvalue);
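The following sketch illustrates what the conversion does and does not do. The function name `std_move_is_a_cast` is hypothetical; the example only relies on `std::move` from `<utility>` and `std::is_same` from `<type_traits>`.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Returns true if writing through the rvalue reference changed the original.
bool std_move_is_a_cast() {
    int lvalue = 99;

    // The expression std::move(lvalue) has type int&&.
    static_assert(std::is_same<decltype(std::move(lvalue)), int&&>::value,
                  "std::move yields an rvalue reference");

    int&& ref = std::move(lvalue);  // binds to the same object; nothing is copied
    assert(&ref == &lvalue);

    ref = 100;  // writes through to lvalue
    return lvalue == 100;
}
```

Since the reference still refers to the original object, any actual "move" only happens later, when a move constructor or move assignment operator receives the rvalue.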
We can also overload a function to detect whether the arguments given are lvalues or rvalues:
#include <iostream>

void print(int& lvalue) { // takes an lvalue
    std::cout << "lvalue method used\n";
}
void print(int&& rvalue) { // takes an rvalue
    std::cout << "rvalue method used\n";
}
int main() {
    int x = 5;
    print(x); // lvalue method used
    print(10); // rvalue method used
}
Overloading functions on Rvalue Reference parameters, which bind to temporary objects, is what makes it possible to write more efficient programs using Move Semantics!
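One subtlety is worth a short sketch: inside a function, a named Rvalue Reference parameter is itself an lvalue, so it has to be passed through `std::move` again to select an rvalue overload. The functions `which`, `pass_by_name` and `pass_with_move` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overloads that report which one was selected.
std::string which(int&)  { return "lvalue"; }
std::string which(int&&) { return "rvalue"; }

// `r` has a name, so the expression `r` is an lvalue inside the function.
std::string pass_by_name(int&& r)   { return which(r); }
// std::move casts it back to an rvalue.
std::string pass_with_move(int&& r) { return which(std::move(r)); }

void overload_demo() {
    int x = 5;
    assert(which(x) == "lvalue");
    assert(which(10) == "rvalue");
    assert(which(std::move(x)) == "rvalue");
    assert(pass_by_name(10) == "lvalue");
    assert(pass_with_move(10) == "rvalue");
}
```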
The main use of Rvalue References is that they allow us to define a move constructor and a move assignment operator alongside the copy constructor and copy assignment operator. Since rvalues are typically temporary objects, we can move their contents instead of copying them, thus reducing memory consumption and improving performance!
One example implementation of a move constructor and move assignment is shown below:
Foo(Foo&& other) { // move constructor
    // other is a named parameter, so std::move is needed to move its members
    x = std::move(other.x);
    y = std::move(other.y);
    z = std::move(other.z);
}
Foo& operator=(Foo&& other) { // move assignment
    x = std::move(other.x);
    y = std::move(other.y);
    z = std::move(other.z);
    return *this;
}
In this move constructor and move assignment operator, the contents of the other parameter are moved into the object. Afterwards, other is left in a valid but unspecified state and is destroyed as usual when its lifetime ends. For members that own heap memory, no additional memory allocation is required, and the move is done quickly with a few pointer assignments, leading to a faster and more memory-efficient program.
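To make the "few pointer assignments" concrete, here is a sketch of a hypothetical resource-owning class, `Buffer` (not part of the original example), whose move operations take over the heap array of the source instead of allocating a new one. Copying is disabled purely to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning class, for illustration only.
class Buffer {
public:
    explicit Buffer(std::size_t size) : data_(new int[size]()), size_(size) {}
    ~Buffer() { delete[] data_; }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // Move constructor: take over the pointer and leave `other` empty.
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    // Move assignment: release our own array, then take over other's.
    Buffer& operator=(Buffer&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = other.data_;
            size_ = other.size_;
            other.data_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }

    const int* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};

void buffer_move_demo() {
    Buffer a(1000);
    const int* original = a.data();

    Buffer b(std::move(a));        // move construction
    assert(b.data() == original);  // same allocation, nothing was copied
    assert(a.data() == nullptr && a.size() == 0);

    Buffer c(1);
    c = std::move(b);              // move assignment
    assert(c.data() == original && c.size() == 1000);
    assert(b.data() == nullptr);
}
```

Resetting the source to an empty state is a design choice of this sketch: it keeps the destructor of the moved-from object safe, since `delete[]` on a null pointer does nothing.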
Along with Move Semantics in C++11, the STL provides move constructors, move assignment operators and rvalue overloads for its container classes (e.g. vector, list, set). We can thus take advantage of Move Semantics by simply supplying rvalues, without the need to redefine the classes ourselves.
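A short sketch of what this looks like with `std::vector` and `std::string`. The function name `container_move_demo` is hypothetical, and the pointer comparison assumes the default allocator.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

void container_move_demo() {
    std::vector<std::string> source(1000, "some text");
    const std::string* storage = source.data();

    // Move construction transfers the element storage instead of
    // copying 1000 strings.
    std::vector<std::string> target(std::move(source));
    assert(target.size() == 1000);
    assert(target.data() == storage);  // same buffer as before the move
    assert(source.empty());            // moved-from vector is empty after move construction

    // push_back has an rvalue overload, so this string is moved into the vector.
    std::string name = "a string long enough to be heap allocated";
    target.push_back(std::move(name));
    assert(target.back() == "a string long enough to be heap allocated");
}
```

After the `push_back`, `name` is in a valid but unspecified state, so the sketch deliberately makes no assertion about its contents.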
After learning about Rvalue References and Move Semantics in C++11, many programmers used to older versions of C++ may fall into the trap of the Rvalue Reference Anti-Pattern.
Consider these 2 class constructors:
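Foo(const std::string& x, const std::string& y) { // "copy" constructor: binds to lvalues and rvalues
    _x = x;
    _y = y;
}
Foo(std::string&& x, std::string&& y) { // "move" constructor: binds to rvalues only
    _x = std::move(x); // x is a named parameter, so std::move is required
    _y = std::move(y);
}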
As shown above, the move constructor declares each of its parameters as an Rvalue Reference. If the constructor is called with a mixture of lvalues and rvalues, such as an lvalue for x and an rvalue for y, the copy constructor will be called instead! This is because a const lvalue reference parameter can accept both lvalues and rvalues, while an Rvalue Reference parameter can only accept rvalues. As such, we may think we have used Move Semantics to optimize our program, but that may not be the case!
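The fallback can be observed with a sketch like the one below. It reconstructs a hypothetical `Foo` with the two overloads discussed above and adds a `moved` flag (an assumption made for illustration) that records which overload ran.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical reconstruction of the class, with a flag recording the overload used.
struct Foo {
    std::string _x, _y;
    bool moved;

    Foo(const std::string& x, const std::string& y)
        : _x(x), _y(y), moved(false) {}
    Foo(std::string&& x, std::string&& y)
        : _x(std::move(x)), _y(std::move(y)), moved(true) {}
};

void mixed_arguments_demo() {
    std::string a = "first";
    std::string b = "second";

    Foo both_rvalues(std::string("first"), std::move(b));
    assert(both_rvalues.moved);  // the rvalue overload was selected

    std::string c = "second";
    Foo mixed(a, std::move(c));  // lvalue for x, rvalue for y
    assert(!mixed.moved);        // falls back to the const& overload
    assert(c == "second");       // so c was copied, not moved from
}
```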
One solution to this problem would be to overload the constructor for each combination of Rvalue Reference parameters possible, like this:
Foo(const std::string& x, const std::string& y) { // copies both
    _x = x;
    _y = y;
}
Foo(std::string&& x, const std::string& y) { // moves x, copies y
    _x = std::move(x);
    _y = y;
}
Foo(const std::string& x, std::string&& y) { // copies x, moves y
    _x = x;
    _y = std::move(y);
}
Foo(std::string&& x, std::string&& y) { // moves both
    _x = std::move(x);
    _y = std::move(y);
}
However, this is not feasible, as we would require 2^n overloads, where n is the number of parameters. This results in increased boilerplate code, which in turn reduces code quality and increases code size and compilation time.
Rather, what should be provided here is:
Foo(std::string x, std::string y) { // takes by value, then moves into the members
    _x = std::move(x);
    _y = std::move(y);
}
Yes, we go back to the old Value Semantics style of constructor! By doing so, it is now up to the caller to decide whether to make an additional copy by calling this constructor with Foo(x, y), or to avoid the copy by calling Foo(std::move(x), std::move(y)), depending on which values are no longer needed. Either way, the constructor moves its parameters into the members, so a single constructor handles every combination of lvalue and rvalue arguments.
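The cost of each calling style can be checked with a small counting sketch. `Tracked` and `Holder` are hypothetical stand-ins (for `std::string` and `Foo` respectively) introduced only to count copies and moves.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
    Tracked& operator=(const Tracked&) { ++copies; return *this; }
    Tracked& operator=(Tracked&&) noexcept { ++moves; return *this; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Hypothetical class using the pass-by-value-then-move constructor.
struct Holder {
    Tracked _x, _y;
    Holder(Tracked x, Tracked y) : _x(std::move(x)), _y(std::move(y)) {}
};

void by_value_demo() {
    Tracked a, b;

    Holder keep_both(a, b);  // the caller keeps a and b
    assert(Tracked::copies == 2 && Tracked::moves == 2);

    Tracked::copies = Tracked::moves = 0;
    Holder give_both(std::move(a), std::move(b));  // the caller gives both away
    assert(Tracked::copies == 0 && Tracked::moves == 4);

    Tracked::copies = Tracked::moves = 0;
    Tracked c, d;
    Holder mixed(c, std::move(d));  // copy c, move d
    assert(Tracked::copies == 1 && Tracked::moves == 3);
}
```

Compared with a dedicated Rvalue Reference overload, this style costs one extra move per argument, which is usually cheap, and in exchange only one constructor has to be written.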
The following resources provide further reading on what was discussed, and a more in-depth tutorial on Rvalue References and Move Semantics: