L/R/U References and Move Semantics
05 Apr 2020
- Move Semantics
- Value Types in terms of ‘move-ability’
- References
- Perfect Forwarding
- Best Practices
- Credits and Sources
Move semantics were introduced in C++11
to provide a standard way to manage memory by moving (instead of copying).
Benefits:
- Avoid unnecessary copies in memory. Return value optimization (RVO) is another technique that serves a similar purpose;
- Resource (memory) management.
The most notable example is unique_ptr.
This post gathers my study notes on this topic.
Move Semantics
At first glance, std::move(object) is the most notable semantic change in this context.
It is important to know what exactly it does, and more importantly, what it does not do.
A good understanding of std::move(object) makes it easier to write examples and tests, so we’ll start with that.
Before jumping to the conclusion, let’s take a quick look at the gcc source code:
// bits/move.h
/**
* @brief Convert a value to an rvalue.
* @param __t A thing of arbitrary type.
* @return The parameter cast to an rvalue-reference to allow moving it.
*/
template<typename _Tp>
constexpr typename std::remove_reference<_Tp>::type&&
move(_Tp&& __t) noexcept
{ return static_cast<typename std::remove_reference<_Tp>::type&&>(__t); }
// traits
template<typename _Tp>
struct remove_reference
{ typedef _Tp type; };
template<typename _Tp>
struct remove_reference<_Tp&>
{ typedef _Tp type; };
template<typename _Tp>
struct remove_reference<_Tp&&>
{ typedef _Tp type; };
std::move(object) does exactly one thing - it produces an rvalue from its input.
It does so in two steps: first, remove_reference strips any reference from the deduced type; then && is appended to form the return type, and the argument is cast to it.
The remove_reference step is required because of reference collapse (the rule that decides the resulting reference type when references are combined): if _Tp is deduced as an lvalue reference, _Tp&& alone would still be an lvalue reference.
IT DOES NOT move anything around.
The actual move happens in the functions that explicitly take rvalue references.
For example, std::vector’s move constructor and move assignment ‘steal’ the data of the source and leave the source in a ‘valid but unspecified state’.
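The claim that std::move is only a cast can be checked with a small sketch. The helper names below (size_after_cast_only, size_of_move_target, check_move_is_a_cast) are made up for illustration; only std::move and std::vector come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// The result of std::move is just the argument retyped as an rvalue reference.
static_assert(std::is_same<decltype(std::move(std::declval<std::vector<int>&>())),
                           std::vector<int>&&>::value,
              "std::move(lvalue) has type T&&");

// Hypothetical helper: cast only, nothing consumes the rvalue.
std::size_t size_after_cast_only() {
    std::vector<int> src{1, 2, 3};
    std::vector<int>&& ref = std::move(src);  // binds a reference; no constructor runs
    (void)ref;
    return src.size();  // src still owns its three elements
}

// Hypothetical helper: the move constructor is what transfers the buffer.
std::size_t size_of_move_target() {
    std::vector<int> src{1, 2, 3};
    std::vector<int> dst(std::move(src));  // vector(vector&&) takes over src's storage
    return dst.size();
}

void check_move_is_a_cast() {
    assert(size_after_cast_only() == 3);
    assert(size_of_move_target() == 3);
}
```

Calling check_move_is_a_cast() passes: the source keeps its elements until something that takes an rvalue reference actually consumes it.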
Why is it named move() then, if it does not ‘move’? Scott Meyers, in item 23 of Effective Modern C++, mentions that std::move(object) functions more like an rvalue_cast.
Speculatively, I think the name was chosen to remind the reader that the object is about to be moved from. After all, from the user’s point of view, there isn’t any other indicator of moving:
std::unique_ptr<T> p1 = std::make_unique<T>();
std::unique_ptr<T> p2(std::move(p1)); // hinting 'move'
// std::rvalue_cast is hypothetical; it does not exist in the standard library
std::unique_ptr<T> p3(std::rvalue_cast(p1)); // no signal of p1's content 'moved' at first glance
Value Types in terms of ‘move-ability’
In the first section, we discussed what std::move(object) does - it is essentially an rvalue_cast.
Being an rvalue determines the ‘move-ability’ of an expression (the concept is broader than objects, as it includes the results of expressions on objects, which can be seen as temporary objects).
Definitions
A few informal definitions that help in understanding the concepts:
- lvalue and rvalue originally referred to the side of the assignment operator = on which an expression can appear. This definition predates C++11 (and possibly C++ itself). In this definition, expressions that can be on the left side of the assignment are lvalues, while all other expressions are rvalues;
- Another definition involves taking the address of the expression with the address-of operator &. lvalues are the expressions whose address can be taken using &;
- Another definition is that rvalues are values that are ‘moveable’. This is an accurate definition of rvalues.
The first two definitions portray the main story - lvalues are the expressions that have an address, and therefore can be on the left side of the assignment to receive a value, and all other expressions that do not fall in this category are rvalues. The full story includes 3 other types (glvalue, prvalue and xvalue): every expression is exactly one of lvalue, xvalue or prvalue; a glvalue is an lvalue or an xvalue, and an rvalue is a prvalue or an xvalue.
A final note on value types - being an rvalue or an lvalue is independent of constness, as the two describe different characteristics of an expression, although constness does play a role in the later section about reference binding.
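One way to inspect these categories is decltype with double parentheses, which yields T& for an lvalue, T&& for an xvalue and plain T for a prvalue. The value_category trait and the names counter, limit and make_int below are hypothetical, introduced only for this sketch.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical trait: maps the type produced by decltype((expr)) to a category name.
template <typename T> struct value_category      { static std::string name() { return "prvalue"; } };
template <typename T> struct value_category<T&>  { static std::string name() { return "lvalue"; } };
template <typename T> struct value_category<T&&> { static std::string name() { return "xvalue"; } };

int counter = 0;
const int limit = 10;
int make_int() { return 42; }

void check_value_categories() {
    // A named variable is an lvalue, const or not.
    assert(value_category<decltype((counter))>::name() == "lvalue");
    assert(value_category<decltype((limit))>::name() == "lvalue");
    // std::move(x) is an xvalue: it has identity and is moveable.
    assert(value_category<decltype((std::move(counter)))>::name() == "xvalue");
    // A by-value function result and a literal are prvalues.
    assert(value_category<decltype((make_int()))>::name() == "prvalue");
    assert(value_category<decltype((3))>::name() == "prvalue");
}
```

The const variable limit is still reported as an lvalue, which matches the note that constness and value category are separate properties.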
Compilation VS Language Abstraction
It is important to note that these concepts are language abstractions instead of compilation rules.
For example, if we compile the below statement with -O0
int i = 3;
we have
mov w0, 3 ; set w0 to 3
str w0, [sp, 12] ; store w0 on the function stack
The compiled output suggests that the expression “3” has no address (it is only an immediate operand), yet at the language level it can still be bound to a reference, which needs an object with an address. In fact, if we do
int &&i = 3;
the compiler generates
mov w0, 3
str w0, [sp, 4] ; store the temporary on the function stack
add x0, sp, 4 ; get the temporary's address
str x0, [sp, 8] ; store that address on the function stack
So the compiler generates only the minimum the code requires. We can still use compiler output for analysis, but we should bear in mind that the concepts in this post are discussed in terms of the C++ language abstraction.
References
C++11 expands the notion of references.
Types
- Lvalue Reference: the original reference. It practically means creating an alias to an existing object. It is denoted by &, and its non-const form binds only to lvalues;
- Rvalue Reference: the new reference. It is a special type of alias, usually indicating that the object it aliases is ‘moveable’. It is denoted by &&, and it binds only to rvalues. I see it as a ‘deeper’ form of reference where we can not only update the original object, but also ‘steal’ its content;
- Universal Reference: binds to anything. Like an rvalue reference, it is denoted by &&, but for a reference to be universal, its type must be deduced directly. There are two forms:
// case 1
template<typename T>
void foo(T&& t) {} // T is deduced from the argument passed to 't'
// case 2
auto && v1 = v2; // v1's type is deduced from v2's type
A Universal Reference works as either an Lvalue Reference or an Rvalue Reference, depending on whether it is bound to an lvalue or an rvalue.
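A minimal sketch of that deduction, assuming nothing beyond the standard library. The probe deduced_as_lvalue_ref and the helper auto_ref_follows_initializer are hypothetical names used only here.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical probe: reports how T was deduced for a universal reference parameter.
template <typename T>
bool deduced_as_lvalue_ref(T&&) {
    return std::is_lvalue_reference<T>::value;
}

// Hypothetical helper: auto&& follows the same rule as the template case.
bool auto_ref_follows_initializer() {
    std::string s = "abc";
    auto&& a = s;              // initializer is an lvalue: a is std::string&
    auto&& b = std::string();  // initializer is an rvalue: b is std::string&&
    (void)a;
    (void)b;
    return std::is_same<decltype(a), std::string&>::value &&
           std::is_same<decltype(b), std::string&&>::value;
}

void check_universal_reference() {
    std::string s = "abc";
    const std::string cs = "def";
    assert(deduced_as_lvalue_ref(s));               // T = std::string&
    assert(deduced_as_lvalue_ref(cs));              // T = const std::string&
    assert(!deduced_as_lvalue_ref(std::string()));  // T = std::string
    assert(!deduced_as_lvalue_ref(std::move(s)));   // T = std::string
    assert(auto_ref_follows_initializer());
}
```

For an lvalue argument T itself becomes an lvalue reference, and for an rvalue argument T is a plain non-reference type.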
Reference Binding
Universal References bind to anything, but what about the other two? What role does constness play in this? Let’s see it in an example:
using V = std::vector<int>;
void value (V v) { ... }
void lref (V &v) { ... }
void rref (V &&v) { ... }
void const_value (const V v) { ... }
void const_lref (const V &v) { ... }
void const_rref (const V &&v) { ... }
int main() {
V var; const V const_var;
...
In the above setup, we create functions using a combination of pass by value, pass by lvalue reference and pass by rvalue reference.
We initialize two variables: the non-const variable var and the const variable const_var (both are lvalues).
To obtain their corresponding rvalues, we use std::move(object).
value(var); // copy ctor called to create parameter
value(const_var); // copy ctor called to create parameter
value(std::move(var)); // move ctor called to create parameter
value(std::move(const_var)); // copy ctor called to create parameter
const_value(var); // copy ctor called to create parameter
const_value(const_var); // copy ctor called to create parameter
const_value(std::move(var)); // move ctor called to create parameter
const_value(std::move(const_var)); // copy ctor called to create parameter
Not a whole lot to see up there:
- Pass-by-value accepts both lvalues and rvalues;
- The move constructor is called for non-const rvalues, and the copy constructor for everything else.
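The constructor choices listed above can be counted with a small instrumented type. Tracked, Counts, by_value and run_by_value_calls are hypothetical names for this sketch; they stand in for V and value() from the example.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts which constructor created each by-value parameter.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

void by_value(Tracked) {}

struct Counts {
    int copies;
    int moves;
};

Counts run_by_value_calls() {
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked var;
    const Tracked const_var;
    by_value(var);                   // lvalue: copy constructor
    by_value(const_var);             // const lvalue: copy constructor
    by_value(std::move(var));        // non-const rvalue: move constructor
    by_value(std::move(const_var));  // const rvalue: Tracked&& cannot bind, so copy constructor
    return Counts{Tracked::copies, Tracked::moves};
}

void check_by_value_calls() {
    Counts c = run_by_value_calls();
    assert(c.copies == 3);
    assert(c.moves == 1);
}
```

Only the call with a non-const rvalue reaches the move constructor; moving from the const object silently falls back to a copy.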
lref(var);
//lref(const_var); // const qualifier discarded
//lref(std::move(var)); // cannot bind non-const lvalue reference to rvalue
//lref(std::move(const_var)); // cannot bind lvalue ref to rvalue
const_lref(var);
const_lref(const_var);
const_lref(std::move(var));
const_lref(std::move(const_var));
Non-const lvalue references bind only to non-const lvalues. Const lvalue references bind to everything.
//rref(var); // cannot bind rvalue reference to lvalue
//rref(const_var); // cannot bind rvalue reference to lvalue
rref(std::move(var));
//rref(std::move(const_var)); // const qualifier discarded
//const_rref(var); // cannot bind rvalue reference to lvalue
//const_rref(const_var); // cannot bind rvalue reference to lvalue
const_rref(std::move(var));
const_rref(std::move(const_var));
Non-const rvalue references bind only to non-const rvalues. Const rvalue references bind to all rvalues, const or not.
When a function call matches both a by-value overload and a by-reference overload, the compiler raises an error complaining that the call is ambiguous:
error: call of overloaded func_name(value_type) is ambiguous
However, when multiple reference function signatures are matched, there is an implicit binding preference (lower number means higher preference when matched):
val-type\ref-type | lref | const-lref | rref | const-rref |
---|---|---|---|---|
lvalue | 1 | 2 | - | - |
const lvalue | - | 1 | - | - |
rvalue | - | 3 | 1 | 2 |
const rvalue | - | 2 | - | 1 |
Conclusion is:
- rvalue prefers rvalue reference;
- Non-const is preferred over const.
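The preference order can be observed by letting overload resolution choose among all four reference kinds. The functions pick and pick_const_only are hypothetical and exist only for this sketch; V is the same alias as in the example above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using V = std::vector<int>;

// Hypothetical overload set covering all four reference kinds.
std::string pick(V&)        { return "lref"; }
std::string pick(const V&)  { return "const-lref"; }
std::string pick(V&&)       { return "rref"; }
std::string pick(const V&&) { return "const-rref"; }

// Hypothetical reduced set: shows the second choice for a non-const rvalue.
std::string pick_const_only(const V&)  { return "const-lref"; }
std::string pick_const_only(const V&&) { return "const-rref"; }

void check_overload_preference() {
    V var;
    const V const_var;
    assert(pick(var) == "lref");
    assert(pick(const_var) == "const-lref");
    assert(pick(std::move(var)) == "rref");
    assert(pick(std::move(const_var)) == "const-rref");
    // Without the V&& overload, a non-const rvalue prefers const V&& over const V&.
    assert(pick_const_only(std::move(var)) == "const-rref");
}
```

Each argument selects the overload whose reference kind and constness match it most closely, which is what the two conclusions summarize.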
Reference Collapse and Universal Reference
An important part of the design of rvalue references is Reference Collapse. Bear in mind that a reference to a reference cannot be written directly in user code, but one can be produced in certain contexts (template instantiation, auto, typedef and decltype).
The Reference Collapse rule is that two references collapse into an lvalue reference if either of them is an lvalue reference, and into an rvalue reference only if both are rvalue references.
This is the reason std::move(object) applies std::remove_reference to its return type: when _Tp is deduced as T& for an lvalue argument, _Tp&& alone would collapse back to T&, so the cast could not turn an lvalue into an rvalue.
Universal Reference is the outcome of Reference Collapse.
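The collapsing rule, and the reason std::move needs remove_reference, can be verified at compile time through type aliases, one of the contexts where a reference to a reference may arise. The aliases LRef, RRef, naive_return and actual_return and the helper same are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <type_traits>

using LRef = int&;
using RRef = int&&;

// Hypothetical helper so the same facts can also be checked at run time.
template <typename A, typename B>
constexpr bool same() { return std::is_same<A, B>::value; }

static_assert(same<LRef&,  int&>(),  "&  + &  collapses to &");
static_assert(same<LRef&&, int&>(),  "&  + && collapses to &");
static_assert(same<RRef&,  int&>(),  "&& + &  collapses to &");
static_assert(same<RRef&&, int&&>(), "&& + && collapses to &&");

// Return type of a std::move-like function with and without remove_reference.
template <typename Tp> using naive_return  = Tp&&;
template <typename Tp> using actual_return = typename std::remove_reference<Tp>::type&&;

static_assert(same<naive_return<int&>,  int&>(),  "without remove_reference an lvalue stays an lvalue");
static_assert(same<actual_return<int&>, int&&>(), "with remove_reference the result is an rvalue reference");

void check_reference_collapse() {
    assert((same<LRef&&, int&>()));
    assert((same<RRef&&, int&&>()));
    assert((same<actual_return<int&>, int&&>()));
}
```

Only the combination of two rvalue references yields an rvalue reference; every other combination yields an lvalue reference.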
Perfect Forwarding
Let us again start with the gcc source code for std::forward(object):
// bits/move.h
/* @brief Forward an lvalue. */
template<typename _Tp>
constexpr _Tp&&
forward(typename std::remove_reference<_Tp>::type& __t) noexcept
{ return static_cast<_Tp&&>(__t); }
/* @brief Forward an rvalue. */
template<typename _Tp>
constexpr _Tp&&
forward(typename std::remove_reference<_Tp>::type&& __t) noexcept
{ static_assert(...); return static_cast<_Tp&&>(__t); }
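// Note: the two overloads cannot be merged into one. The parameter type is
// built from std::remove_reference<_Tp>::type, which is never a reference,
// so no Reference Collapse happens there: the '&' overload accepts lvalues
// and the '&&' overload accepts rvalues. Collapsing only happens in the
// return type and the cast, where _Tp&& is T& when _Tp is T&, and T&& otherwise.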
Given these facts, in order to forward an rvalue, _Tp needs to be the non-reference type.
Supposing T is the non-reference type, we have
std::forward<T>(rvalue); // can't be 'T&' because return type 'T & &&' collapses into 'T&'
To forward an lvalue, _Tp needs to be the lvalue reference type, i.e. T&:
std::forward<T&>(lvalue); // can't be 'T' because it would result in 'T &&'
std::forward(object) is designed to be used with Universal References. The template deduction rule for a Universal Reference is that the deduced type is an lvalue reference if the passed-in argument is an lvalue, and a non-reference type if the passed-in argument is an rvalue, which perfectly matches the implementation of std::forward(object).
std::forward(object) does what its name suggests - it forwards an lvalue or an rvalue depending on the value category of the ARGUMENT (not to be confused with the parameter; in fact, it is precisely designed to deal with the reality that, regardless of the argument, the named parameter is always an lvalue by definition).
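A short sketch of the difference forwarding makes. The functions sink, relay_forward and relay_plain are hypothetical names; sink only reports which overload was selected and does not actually move from its argument.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical sink: reports which overload the argument reached.
std::string sink(const std::string&) { return "lvalue"; }
std::string sink(std::string&&)      { return "rvalue"; }

// Forwards the argument with its original value category.
template <typename T>
std::string relay_forward(T&& arg) {
    return sink(std::forward<T>(arg));
}

// Passes the named parameter as-is: inside the function it is always an lvalue.
template <typename T>
std::string relay_plain(T&& arg) {
    return sink(arg);
}

void check_perfect_forwarding() {
    std::string s = "x";
    assert(relay_forward(s) == "lvalue");
    assert(relay_forward(std::string("y")) == "rvalue");
    assert(relay_forward(std::move(s)) == "rvalue");
    // Without std::forward, the rvalue-ness of the argument is lost.
    assert(relay_plain(std::string("y")) == "lvalue");
}
```

relay_plain always reaches the lvalue overload, while relay_forward restores the category the caller passed in.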
Best Practices
Conclusion time!
- Use std::forward(object) for Universal References, as it is designed for this exact purpose; use std::move(object) to pass an rvalue on from an rvalue reference, or when explicitly moving from an object. Although, with some tweaks, the two functions can be used interchangeably, std::move(object) and std::forward(object) serve very different purposes. Following this rule also improves consistency and readability;
- Don’t make objects const if you plan to move them. It doesn’t make sense to declare objects we plan to modify along the way as const in the first place. More dangerously, these move requests are silently transformed into copies if a copy signature exists (item 23, page 160);
- Delete copy constructors if not required. This helps mitigate the misuse described in the previous point for user defined types, by preventing the copy constructor from being called in place of the move constructor;
- Use a universal reference instead of overloading on lvalue reference and rvalue reference for the same purpose. The benefits are 1. code readability and maintainability (less code); 2. scalability (the number of overloads increases exponentially with the number of parameters); 3. performance improvement (item 25, page 171).
  // prefer universal reference
  template <typename T>
  void foo(T&& t) { bar(std::forward<T>(t)); }
  // overly overloading
  void foo(const std::string& s) { bar(s); }
  void foo(std::string&& s) { bar(std::move(s)); }
- Call std::move(object) or std::forward(object) only when not planning to use the object further, as it may be left in a moved-from (valid but unspecified) state;
- Apply std::move(object) or std::forward(object) to the return value if it is bound to an rvalue reference or a universal reference, respectively. This lets the returned object be move-constructed instead of copied (for a universal reference, only if the forwarded value is an rvalue);
- Follow RVO guidelines when returning local variables. When copy elision does not apply, the C++ standard dictates that the compiler shall treat the returned local object as if std::move(object) had been applied, so an explicit std::move is not needed there (item 23, page 160);
- Avoid overloading on Universal References, and especially avoid declaring a constructor with Universal References, as implicit overloading rules could result in unwanted behavior, and implicitly generated constructors don’t work well with a universal reference constructor (item 26, item 27);
- auto && has the same effect as a universal reference. They (and also typedef and decltype) are all a result of Reference Collapsing;
- A move operation is most effective on heap allocated objects. E.g., a move operation on std::array is linear in its size (albeit faster than a copy), and a move of a small std::string may be no faster than a copy (item 29);
- As of C++17, overloading the . (dot) operator is yet to be allowed, although some discussion is on-going. Currently, member access results in the same l/r-ness as its host object:
  void func (V &v) { std::cout << "by lref" << std::endl;}
  void func (V &&v) { std::cout << "by rref" << std::endl;}
  T t; // T has a member v of type V
  func(t.v); // "by lref"
  func(std::move(t).v); // "by rref"
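The two points above about const objects and deleted copy constructors can be checked at compile time. This is only a sketch: Copyable and MoveOnly are hypothetical types, and std::is_constructible is used to ask whether construction from a const rvalue (the result of std::move on a const object) would compile.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type with both copy and move constructors.
struct Copyable {
    Copyable() {}
    Copyable(const Copyable&) {}
    Copyable(Copyable&&) noexcept {}
};

// Hypothetical type whose copy constructor is deleted.
struct MoveOnly {
    MoveOnly() {}
    MoveOnly(const MoveOnly&) = delete;
    MoveOnly(MoveOnly&&) noexcept {}
};

// 'Moving' a const Copyable compiles: the copy constructor quietly accepts it.
static_assert(std::is_constructible<Copyable, const Copyable&&>::value,
              "const rvalue falls back to the copy constructor");

// With the copy constructor deleted, the same request is rejected by the compiler.
static_assert(!std::is_constructible<MoveOnly, const MoveOnly&&>::value,
              "const rvalue cannot be moved and cannot be copied");

// A genuine (non-const) move still works.
static_assert(std::is_constructible<MoveOnly, MoveOnly&&>::value,
              "non-const rvalue uses the move constructor");

void check_const_move_rules() {
    assert((std::is_constructible<Copyable, const Copyable&&>::value));
    assert((!std::is_constructible<MoveOnly, const MoveOnly&&>::value));
    assert((std::is_constructible<MoveOnly, MoveOnly&&>::value));
}
```

Deleting the copy constructor turns the silent copy into a compile error, which is the mitigation the list describes.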
Credits and Sources
Library source code snippets are from gcc 7.5. Most of the topics in this post restate material from Scott Meyers’s Effective Modern C++. This post is my reading log of its Chapter 5.