C++ Type Conversion and Casts


Continuing the story of C++ types, we explore conversions between types and the C++ cast operators. We discuss the differences between implicit and explicit conversions, and then elaborate on each cast operator.

C++ inherits two kinds of type conversion from C: implicit conversion and explicit conversion (also known as the C-style cast). C++ adds four named cast operators for explicit type conversions: const_cast, static_cast, reinterpret_cast and dynamic_cast.

Implicit Conversion

Implicit type conversion refers to situations where one type is expected but an expression of another type is provided, and the compiler still determines that the code is well-formed. Example scenarios include

  1. function parameter, including operator parameter;
  2. object initialization, including return object;
  3. expressions, e.g. if expects bool, but we sometimes provide integers, pointers, etc.

CPP Reference details the rules for legal implicit conversions. Here is an abridged and overly simplified version: an implicit conversion is legal when the input type can be converted to the target type by a sequence of standard conversions (e.g. lvalue-to-rvalue, array/function-to-pointer, cv-qualification adjustment, derived-to-base pointer conversion), optionally combined with at most one user-defined conversion (a non-explicit conversion function, or a non-explicit constructor callable with a single argument).
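As a rough illustration of the three scenarios and the simplified rule, here is a minimal sketch. The names Meters, twice, one_and_a_half, has_text and buffer_has_text are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Hypothetical type with a non-explicit converting constructor.
struct Meters {
  double value;
  Meters(double v) : value(v) {}  // double -> Meters (user-defined conversion)
};

// 1. Function parameter: the argument is converted to the parameter type.
double twice(Meters m) { return 2.0 * m.value; }

// 2. Return object: the returned expression is converted to the return type.
Meters one_and_a_half() { return 1.5; }

// 3. Condition: a pointer is contextually converted to bool.
bool has_text(const char* p) {
  if (p) return true;
  return false;
}

// Standard conversions only: array-to-pointer, followed by a
// qualification adjustment (char* -> const char*).
bool buffer_has_text() {
  char buffer[] = "abc";
  return has_text(buffer);
}

void implicit_conversion_demo() {
  // int -> double (standard) and then double -> Meters (one user-defined).
  assert(twice(3) == 6.0);
  assert(one_and_a_half().value == 1.5);
  assert(!has_text(nullptr));
  assert(buffer_has_text());
}
```

Each call site compiles without a cast because the compiler finds a conversion sequence containing at most one user-defined step.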

explicit keyword

A simple example to illustrate the basics of implicit conversion and explicit keyword:

struct ExprT3 {}; struct ExprT4 {};
struct T
{
  T() {}
  /* Converting Constructor */
  T(ExprT3) {}
  explicit T(ExprT4) {}
};
/* User-defined Conversion Function */
struct ExprT1 { operator T() { return T();} };
struct ExprT2 { explicit operator T() { return T();} };
/*
 * A function taking 'T' as its parameter.
 * An implicit conversion is triggered when the
 * argument can legally convert to 'T' implicitly.
*/
void func(T) {}
int main() {
  ExprT1 expr1; ExprT2 expr2; ExprT3 expr3; ExprT4 expr4;
  func(expr1);
  func(expr3);
  /* 'explicit' keyword prevents implicit conversion */
  //func(expr2); // error: could not convert ‘expr2’ from ‘ExprT2’ to ‘T’
  //func(expr4); // error: could not convert ‘expr4’ from ‘ExprT4’ to ‘T’
}

Implicit conversion allows flexible and concise syntax. On the flip side, it also introduces new ways of creating bugs through unintended conversions, which are fairly difficult to spot. The Committee decided that the downside was significant enough to introduce the explicit keyword for constructors back in C++98; C++11 extended it to conversion functions such as ExprT2's operator T() above.
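To make the "unintended conversion" risk concrete, here is a small sketch. LooseBuffer, StrictBuffer and the helper functions are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// The one-argument constructor is NOT explicit, so a plain integer
// silently converts to a LooseBuffer.
struct LooseBuffer {
  std::size_t size;
  LooseBuffer(std::size_t n) : size(n) {}
};

// Same type, but with the constructor marked explicit.
struct StrictBuffer {
  std::size_t size;
  explicit StrictBuffer(std::size_t n) : size(n) {}
};

std::size_t loose_size(LooseBuffer b) { return b.size; }
std::size_t strict_size(StrictBuffer b) { return b.size; }

void explicit_keyword_demo() {
  // Compiles, although passing a bare number as a "buffer" is likely a typo.
  assert(loose_size(42) == 42);

  // strict_size(42);  // error: no implicit conversion from int

  // The conversion must be spelled out, which documents the intent.
  assert(strict_size(StrictBuffer(42)) == 42);
}
```

With the explicit constructor, the accidental call no longer compiles, so the mistake surfaces at build time instead of at run time.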

Explicit Conversion

The explicit conversion that C++ inherited from C, also known as the C-style cast, isn't flawless either. It shares similar issues with the implicit conversion mechanism: it simplifies the syntax of the language, but it sometimes does a little too much, which obscures the intention and makes bugs hard to spot. To address this, C++ provides four named cast operators, which have been part of the standard since C++98.

Const Cast

Let's start with the simplest form of casting, const_cast. Its name is not too far from the truth - const_cast is meant for modifying cv-qualification:

  • It is the only named cast operator allowed to remove cv-qualifiers;
  • It cannot modify the base type (what remains after we take out the qualifiers) - the types of the input expression and the output result must be the same after disregarding cv-qualifiers.

Key points:

  • const_cast results in either a pointer or a reference. It's pointless to return by value - getting a copy of the object with a different cv-qualifier is trivial, and that is not the purpose of const_cast. const_cast is used to access the exact same object through a differently cv-qualified handle;
    int var1 = 0;
    const int& var2 = const_cast<const int&>(var1);
    std::cout << (&var1 == &var2) << std::endl; // '1'
    
  • const_cast does not modify the (cv-qualifier of the) underlying object; it provides a differently cv-qualified handle. Attempting to modify a const object through a non-const reference/pointer obtained with const_cast leads to undefined behavior. Most use cases of const_cast involve either 1. dealing with cv-incorrect legacy APIs or 2. working around a design flaw, so we should avoid it if possible.
    const int c_var = 5; int var = 5;
    std::cout << c_var << " " << var << std::endl; // initial values: '5 5'
    const int& cref_to_var = var;
    int *test1 = const_cast<int*>(&c_var);
    int *test2 = const_cast<int*>(&cref_to_var);
    *test1 = 10; *test2 = 10;
    std::cout << c_var << " " << var << std::endl; // typically '5 10', but the write through test1 is UB, so nothing is guaranteed.
    
  • When converting to references, the following rules apply:
    struct A {}; int var = 0;
    /* lvalues can be converted to lvalue or rvalue references */
    int& test1 = const_cast<int&>(var);   // lvalue to l-ref; same works for class type
    int&& test2 = const_cast<int&&>(var); // lvalue to r-ref; same works for class type
    /* prvalues: restriction on built-in types to allow some compiler optimization */
    //int&& test3 = const_cast<int&&>(1);            // prvalue of built-in not allowed
    A&& test4 = const_cast<A&&>(A());                // prvalue of class type allowed
    /* xvalues can be converted to rvalue references */
    int&& test5 = const_cast<int&&>(std::move(var)); // xvalue of built-in
    A&& test6 = const_cast<A&&>(std::move(A()));     // xvalue of class (compiles, but the temporary dies at the end of this statement)
    

    As CPP Reference puts it:

    lvalue of any type T may be converted to a lvalue or rvalue reference to the same type T, more or less cv-qualified. Likewise, a prvalue of class type or an xvalue of any type may be converted to a more or less cv-qualified rvalue reference.

  • When converting to pointers, we can be quite liberal with cv-qualifiers and modify them at any or all levels:
    int const * volatile * const p = nullptr;
    int** pCast = const_cast<int**>(p);
    
  • const_cast only instructs the compiler's type system and by itself generates no machine instructions, so it has zero impact on run time.
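The two use cases mentioned above can be sketched as follows. The function legacy_length and the class Counter are hypothetical and stand in for a cv-incorrect legacy API and a class that reuses its const accessor.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical legacy API: it only reads the buffer, but its signature
// predates const-correctness and takes a non-const pointer.
std::size_t legacy_length(char* s) {
  std::size_t n = 0;
  while (s[n] != '\0') ++n;
  return n;
}

// Const-correct wrapper. Casting away const is acceptable here only
// because legacy_length is assumed never to write through the pointer.
std::size_t length(const char* s) {
  return legacy_length(const_cast<char*>(s));
}

class Counter {
 public:
  const int& value() const { return count_; }
  // Reuse the const overload. *this is known to be non-const here, so
  // removing const from the result never touches a const object.
  int& value() {
    return const_cast<int&>(const_cast<const Counter&>(*this).value());
  }

 private:
  int count_ = 0;
};

void const_cast_demo() {
  assert(length("cast") == 4);
  Counter c;
  c.value() = 7;  // writes through the non-const handle
  assert(c.value() == 7);
}
```

In both cases the cast changes only the handle. If legacy_length ever wrote to a string literal, or if the Counter were itself declared const and modified, the behavior would be undefined.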

Static Cast

The second cast we discuss is static_cast. In fact, this is the cast operator we should consider by default. Its name stems from the terminology of dynamic polymorphism, where dynamic refers to run-time actions and static to compile-time ones.

  • In general, static_cast allows casting between types whose conversions are well-defined. Many very different conversions are allowed with static_cast, but that shouldn't distract us from the bigger picture: static_cast is restricted to conversions that 'naturally make sense' to the user.

Key points:

  • static_cast is not limited to pointers and references. This sets it apart from the other cast operators, where casting usually means interpreting the input object differently. In type-to-type conversions, static_cast creates temporary objects of the output type using standard conversions, conversion operators or constructors. Using the same examples from implicit conversions:
    /* See previous definitions */
    T t1 = static_cast<T>(expr1); // conversion operator 
    T t2 = static_cast<T>(expr2); // conversion operator
    T t3 = static_cast<T>(expr3); // constructor
    T t4 = static_cast<T>(expr4); // constructor
    

    Standard conversions:

    int i = static_cast<int>(3.14);  // fraction discarded: i == 3
    float f = static_cast<float>(i); // integral-to-floating conversion: f == 3.0
    

    The constructor argument is still subject to the implicit conversion rules, so at most one user-defined conversion is added:

    struct A {};              struct B {B(A a) {}};
    struct C {C(B b) {}};     struct D {D(C c) {}};
    A a;
    B b = static_cast<B>(a);       // okay, A->B through B's constructor
    C c = static_cast<C>(b);       // okay, B->C through C's constructor
    D d = static_cast<D>(c);       // okay, C->D through D's constructor
    D dd = static_cast<D>(b);      // okay, B->D with one implicit conversion B->C for D's constructor argument
    //D ddd = static_cast<D>(a);   // not okay, A->D would need two implicit user-defined conversions (A->B->C)
    
  • static_cast potentially has a run-time cost. This partly repeats the first point, but it's worth mentioning: static_cast may invoke (expensive) user-defined conversion functions or constructors.
  • static_cast conversions usually 'make sense' but are not always safe. A simple example is
    int i = static_cast<int>(1e10); // Undefined Behavior: the value does not fit in a 32-bit int
    
  • static_cast can be used for some casts through inheritance. Casting objects directly involves Object Slicing. In this post, we focus on casting references or pointers, e.g. static_cast<Base*>(derived):
    • Both upcasts and downcasts are allowed. Sidecasts are not;
    • Both polymorphic types (classes with virtual methods, whose objects carry a vtable pointer) and non-polymorphic types can be cast;
    • Casting through multiple inheritance is allowed. The compiler applies the correct offset based on the class memory layout, which is known at compile time;
    • Casting through virtual inheritance is allowed upwards but not downwards. The reasons are left for another post;
    • To elaborate on the previous point about static_cast's lack of safety: static_cast performs no run-time check when casting through inheritance, so an incorrect downcast goes undetected. See Dynamic Cast.
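The inheritance rules above can be sketched with a small non-polymorphic hierarchy. Left, Right and Both are hypothetical types made up for this illustration.

```cpp
#include <cassert>

struct Left  { int l = 1; };
struct Right { int r = 2; };
struct Both : Left, Right { int b = 3; };

void static_cast_inheritance_demo() {
  Both obj;

  // Upcasts: always valid; the compiler applies the subobject offset.
  Left*  left  = static_cast<Left*>(&obj);
  Right* right = static_cast<Right*>(&obj);
  assert(left->l == 1 && right->r == 2);

  // The two base subobjects are distinct, so the raw addresses differ.
  assert(static_cast<void*>(left) != static_cast<void*>(right));

  // Downcast: accepted without any run-time check. It is only correct
  // because 'right' really points into a Both object.
  Both* back = static_cast<Both*>(right);
  assert(back == &obj);

  // static_cast<Right*>(left);  // sidecast: does not compile
}
```

The downcast recovers the original address because the compiler subtracts the same offset it added during the upcast. Had right pointed to a standalone Right object, the downcast would still compile, and using the result would be undefined behavior.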

CPP Reference lists a total of ten scenarios for static_cast. Other than the ones we listed above, there are a few notable use cases:

  • Item 3 - lvalue-to-xvalue conversion static_cast<T&&>(t), which both std::move() and std::forward() rely on;
  • Item 4 - conversion to void type, which explicitly discards the value of an expression;
  • Item 10 - conversion between a pointer to any type T and a pointer to void, which allows static_cast to behave like reinterpret_cast:
    static_cast<T2*>(static_cast<void*>(&t1)); // T1 t1;
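A minimal sketch of these three items follows. The function my_move is a hypothetical, simplified stand-in for std::move and not the standard library's actual source; round_trip is likewise invented for the example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Simplified stand-in for std::move (hypothetical name): the whole job is
// a static_cast to an rvalue reference.
template <typename T>
constexpr typename std::remove_reference<T>::type&& my_move(T&& t) noexcept {
  return static_cast<typename std::remove_reference<T>::type&&>(t);
}

// Pointer -> void* -> pointer of the original type is well-defined.
int* round_trip(int* p) {
  void* erased = static_cast<void*>(p);
  return static_cast<int*>(erased);
}

void static_cast_items_demo() {
  std::string source = "payload";
  static_assert(
      std::is_same<decltype(my_move(source)), std::string&&>::value,
      "my_move yields an rvalue reference");

  std::string target = my_move(source);  // selects the move constructor
  assert(target == "payload");

  int x = 5;
  assert(round_trip(&x) == &x);

  static_cast<void>(x);  // conversion to void: the value is discarded
}
```

Converting the void* back to a type other than the original one compiles just as well, which is the loophole that lets static_cast mimic reinterpret_cast.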
    

Reinterpret Cast

If static_cast resembles the American left -

  • common sense is applied: conversions are generally safe and oftentimes they make sense from a user's perspective;
  • safety is not guaranteed and loopholes exist (through a middle man void*, a pointer could be converted to another pointer of arbitrary type).

then reinterpret_cast is comparable to the American right - freedom without regard to any circumstances.

It is mainly used to reinterpret a given region of memory as any type that the user desires. A few important characteristics of this cast:

  • No run-time cost. Similar to const_cast, it is an instruction to the compiler. No new objects are created, so unlike static_cast, it incurs no extra cost from constructors, conversion operator functions, etc;

  • It mainly operates on pointers/references. It can be used to convert a pointer/reference to an arbitrary type T1 into a pointer/reference to an arbitrary type T2. Additionally, it can be used to convert pointers to/from integral types.
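Two of the safer uses can be sketched as follows. The helper names round_trips and sum_of_bytes are hypothetical, and the sketch assumes a platform that provides std::uintptr_t and std::uint32_t, as mainstream ones do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pointer -> integer -> pointer round trip.
bool round_trips(int* p) {
  std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<int*>(bits) == p;
}

// Inspecting an object's bytes through unsigned char* is one of the few
// reinterpretations that the aliasing rules permit.
unsigned sum_of_bytes(const std::uint32_t& value) {
  const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
  unsigned sum = 0;
  for (std::size_t i = 0; i < sizeof(value); ++i) sum += bytes[i];
  return sum;
}

void reinterpret_cast_demo() {
  int x = 0;
  assert(round_trips(&x));
  // 0x01 + 0x02 + 0x03 + 0x04, independent of byte order.
  assert(sum_of_bytes(0x01020304u) == 10);
}
```

Reading the same memory through an unrelated type such as float* would compile equally well, but accessing the object that way generally violates the aliasing rules.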

C Style Cast

C style cast is the explicit type conversion in C. In C++, it is a combination of some of the explicit cast operators mentioned above.

  • Both (T2)t1 and T2(t1) (the function-style form) indicate a C style cast. For a single expression they are equivalent, and the parentheses mainly group things together, e.g. the underlying type with its qualifiers, expressions, etc.
  • It does one or more of const_cast, static_cast and reinterpret_cast. Specifically, the following are attempted in order:
    1. const_cast
    2. static_cast
    3. static_cast followed by const_cast
    4. reinterpret_cast
    5. reinterpret_cast followed by const_cast

    Examples:

      /* Assumes: struct Base {}; struct Derived : Base {}; */
      /* const_cast */
      const int const_i = 5;
      auto i_p = (int*) &const_i;
      /* static_cast */
      Derived drv;
      auto base_p = (Base*) &drv;
      /* static_cast + const_cast */
      const Derived const_drv{};
      auto base_p2 = (Base*)&const_drv;
      /* reinterpret_cast */
      float f2 = 5.6f;
      auto i_p2 = (int*) &f2;
      /* reinterpret_cast + const_cast */
      const float f3 = 5.6f;
      auto i_p3 = (int*) &f3;
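To show what the compiler picks in each case, the sketch below pairs every C style cast with the named cast (or combination) it resolves to. Base and Derived are hypothetical types defined only for this example.

```cpp
#include <cassert>

struct Base { int id = 1; };
struct Derived : Base {};

void c_style_cast_demo() {
  const int const_i = 5;
  Derived drv;
  const Derived const_drv{};
  float f = 5.6f;
  const float cf = 5.6f;

  // const_cast
  assert((int*)&const_i == const_cast<int*>(&const_i));
  // static_cast
  assert((Base*)&drv == static_cast<Base*>(&drv));
  // static_cast followed by const_cast
  assert((Base*)&const_drv ==
         const_cast<Base*>(static_cast<const Base*>(&const_drv)));
  // reinterpret_cast
  assert((int*)&f == reinterpret_cast<int*>(&f));
  // reinterpret_cast followed by const_cast
  assert((int*)&cf ==
         const_cast<int*>(reinterpret_cast<const int*>(&cf)));
}
```

The left-hand sides all look alike, while the right-hand sides state exactly which guarantees are being given up.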
    

C style cast provides backwards compatibility with the C language, and the flexibility of performing a range of casts with a single expression. Its disadvantages are pretty obvious too:

  • It is difficult to interpret the user's intention, as the same expression can have a variety of meanings, which is especially troublesome when troubleshooting issues.
  • Not search-friendly. Parentheses are used in everything from function calls and function definitions to grouping elements and basic statements, so it is nearly impossible to search for casting operations.

Dynamic Cast

Lastly, dynamic_cast safely (down)casts polymorphic types. Key points:

  • It incurs a run-time cost. static_cast sometimes incurs a run-time cost too, but that is usually the cost required to convert the types. dynamic_cast's run-time cost, however, comes from the additional check that guarantees the cast is safe;
  • It works only with pointers or references. For pointers, a failed cast returns nullptr; for references, a failed cast throws std::bad_cast;
  • Although unnecessary, dynamic_cast can be used for upcasting. As upcasting is always safe, no additional type safety check is done, and it is exactly the same as static_cast in these cases. Additionally, in these cases, the types are not required to be polymorphic;
  • The underlying worker function that performs the safety check (__dynamic_cast in the Itanium C++ ABI used by GCC and Clang) requires the types to be polymorphic. When the compiler cannot resolve a dynamic_cast statically, it accepts the conversion between (pointers/references to) class types and defers the validation to run time, where __dynamic_cast checks the object's dynamic type, i.e. the type_info reachable from its virtual table. Only polymorphic types have a virtual table;
  • A special use case of dynamic_cast: casting a pointer to a polymorphic type to void* yields a void pointer to the most derived object. Example use case: we can examine whether two pointers refer to the same object in a multiple inheritance scenario by casting them to void* and comparing the addresses.

Final Notes

Rule of Thumb

  • Prefer explicit conversions to implicit conversions. Prefer C++ style casts to C style casts;
  • const_cast should generally be avoided, unless we are dealing with a legacy API and it's absolutely necessary to cast away const/volatile qualifiers;
  • When casting, choose static_cast by default, as it handles most of the expected conversions;
  • Use reinterpret_cast only for low-level reinterpretation of memory;
  • An absolute need for dynamic_cast might imply a bad design. A better solution is to avoid logic that depends on types unknown until run time, and to use static_cast for better performance.
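As one possible reading of the last rule, the sketch below contrasts a dynamic_cast type switch with plain virtual dispatch. The Shape hierarchy and both functions are hypothetical, and virtual dispatch is only one of several ways to avoid the cast.

```cpp
#include <cassert>

struct Shape {
  virtual ~Shape() = default;
  virtual int corners() const = 0;
};
struct Triangle : Shape {
  int corners() const override { return 3; }
};
struct Square : Shape {
  int corners() const override { return 4; }
};

// Type-switch version: needs a new branch for every new subclass and
// pays for a run-time type check per branch.
int corners_by_cast(const Shape& s) {
  if (dynamic_cast<const Triangle*>(&s)) return 3;
  if (dynamic_cast<const Square*>(&s)) return 4;
  return 0;
}

// Virtual dispatch: the object itself knows the answer, no cast needed.
int corners_by_virtual(const Shape& s) { return s.corners(); }

void rule_of_thumb_demo() {
  Triangle t;
  Square q;
  assert(corners_by_cast(t) == corners_by_virtual(t));
  assert(corners_by_cast(q) == corners_by_virtual(q));
}
```

Both functions return the same values here, but only the virtual version keeps working unchanged when another subclass is added.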
