Little wonders of C++ (2)

C++ is transforming into a better language, one that is even more powerful than before, much easier to write, and more robust.

In the last part we had a short look at new keywords like final or override. We also introduced shared_ptr and vector, and had a look at the new unified and move constructors.

This time we will look at more language features, with most features being C++11 related.

C++ casts

Of course one may use C-style casts in C++, such as (T)foo. For class types, where the conversion is performed by a constructor call, the functional form T(foo) reads more naturally. For primitive types such as int or double, however, neither form says which kind of conversion is intended, so a more explicit form is more informative.

double d = 3.0;
int number = static_cast<int>(d);

We should use static_cast to convert between elementary types. The compiler selects the conversion at compile time (statically), and it typically maps to a single machine instruction. If, on the other hand, we only want to interpret the bits of a variable as another type, we might reach for reinterpret_cast. So if we want to see the exact bit pattern of a single-precision floating-point value, shown as an integer, we could use the following code:

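float f = 1.3f;
// Caution: reading a float through an int* violates the strict aliasing rule (formally undefined behavior); shown for illustration only.
int representation = *reinterpret_cast<int*>(&f);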

In C++ reinterpret_cast can only perform a specific set of conversions, explicitly listed in the language specification. In short, reinterpret_cast can only perform pointer-to-pointer conversions and reference-to-reference conversions (plus pointer-to-integer and integer-to-pointer conversions). This is consistent with the intent expressed in the very name of the cast: it is intended to be used for pointer/reference reinterpretation.
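One caveat about the snippet above: dereferencing the reinterpreted pointer is formally undefined behavior, because a float object is read through an int pointer. A commonly recommended alternative is to copy the bytes with std::memcpy. The following is a minimal sketch of that approach; the helper name float_bits is made up for this example, and the code assumes that float is 32 bits wide.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of the article's code): returns the bit
// pattern of a float as a 32-bit unsigned integer without breaking the
// aliasing rules.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // copy the object representation
    return bits;
}
```

On a platform with IEEE 754 floats, float_bits(1.0f) should yield 0x3F800000.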

The two casts above may result in machine instructions, but they remain practically transparent. The const_cast, in contrast, is a compile-time-only cast. It can be used to cast away the constness of a variable. We should be very cautious with it, as const is not only very useful, but usually set with a desired behavior in mind: a function that takes non-const parameters might mutate them, and modifying an object that was originally declared const is undefined behavior.
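One situation where const_cast is commonly considered acceptable is calling an older interface that forgot the const, but is known not to write to its argument. The sketch below illustrates this under exactly that assumption; both function names are hypothetical.

```cpp
#include <cstddef>

// Hypothetical legacy function: takes a non-const pointer although it
// only reads from the buffer.
std::size_t legacy_length(char* text) {
    std::size_t n = 0;
    while (text[n] != '\0') ++n;
    return n;
}

// Const-correct wrapper. The const_cast is safe only because
// legacy_length never writes through the pointer.
std::size_t length(const char* text) {
    return legacy_length(const_cast<char*>(text));
}
```

If legacy_length ever started to modify the buffer, calling length with a string literal would become undefined behavior, which is why such wrappers deserve a comment.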

Finally there is also a runtime cast, which is very powerful, but also non-transparent. If we want to cast from a general (base) class to a specialized (derived) class, we can use dynamic_cast, which checks the actual type at runtime and requires the base class to have at least one virtual function. Needless to say, such casts should be used with great caution, as the need for them can be an indicator of design problems.
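To make the runtime check concrete, here is a small sketch with a made-up class hierarchy (Shape, Circle and Square are illustration names only). It shows the pointer form of dynamic_cast, which yields a null pointer when the object is not of the requested type.

```cpp
// Hypothetical class hierarchy, used only for illustration.
struct Shape {
    virtual ~Shape() {}  // dynamic_cast requires a polymorphic base class
};

struct Circle : Shape {
    double radius;
    explicit Circle(double r) : radius(r) {}
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
};

// Returns the radius if the shape is a Circle, otherwise -1.0.
double radius_or_negative(const Shape& shape) {
    // For pointers, a failed dynamic_cast yields nullptr (a failed
    // reference cast would throw std::bad_cast instead).
    if (const Circle* circle = dynamic_cast<const Circle*>(&shape)) {
        return circle->radius;
    }
    return -1.0;
}
```

In a real design a virtual function on Shape would often be the cleaner choice, which is the design concern mentioned above.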

What is the actual benefit of using C++ casts? First, we are able to be a lot more specific. Second, finding such casts is much easier (e.g. by using grep or similar programs). Finally, sometimes these are the only casts for solving a specific problem (e.g. dynamic_cast for checking if an instance is of a certain type).

The auto keyword

Inferring type is a strong feature of C#. Actually C# was not the first language to include this feature, but it certainly helped to emphasize how important this feature is. Why is using type inference so crucial?

Refactoring is much easier
Types need to be specified less often
No accidental casting

So the main reason is to improve refactorability. If we change the return type of a function (or rename that type), but do not change how the result is used (operators, method names, ...), then the code in every function that invokes the changed function keeps working without changes.

In C++11 the statement above only holds for callers that do not themselves return the value obtained from the invoked function. The reason is simple: such a caller has to state its return type by name, and since that name changed, it needs to be changed here as well. C++14 allows auto for return types too, so there the statement is true in all scenarios.
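The following sketch illustrates the point; it needs a C++14 compiler because of the deduced return type. All function names are hypothetical. If load_scores were changed to return a different map-like container, neither of the two other functions would need to be touched.

```cpp
#include <map>
#include <string>

// Hypothetical function whose return type we might change later,
// e.g. to std::unordered_map<std::string, int>.
std::map<std::string, int> load_scores() {
    return {{"alice", 3}, {"bob", 5}};
}

// C++14: the return type is deduced, so passing the result on does not
// require spelling out the type name.
auto load_scores_cached() {
    return load_scores();
}

int total_score() {
    auto scores = load_scores_cached();  // no type name to update
    int total = 0;
    for (const auto& entry : scores) {   // entry is a key/value pair
        total += entry.second;
    }
    return total;
}
```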

We will see that auto is also very handy in combination with the range based for loop, and with lambda expressions.

Consistency with nullptr

In C there is a nice little macro called NULL. In C++ this is essentially just a long way to write 0 (C implementations often define it as ((void*)0)). Usually it is used to set pointers to a consistent address, which is invalid and easy to check. Since 0 is an integer and pointers are a different kind of type, the language relies on an implicit conversion for assignment and comparison. Nothing wrong so far.

However, there are implications that might be unwanted. For one, C++ allows function overloads. That way we could accidentally select an overload that takes an integer when we pass NULL.
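A small sketch of the overload problem, with a hypothetical pair of overloads named describe. The literal 0 is used instead of NULL here, because the exact definition of NULL differs between compilers.

```cpp
#include <string>

// Two hypothetical overloads: one for integers, one for pointers.
std::string describe(int)         { return "int"; }
std::string describe(const char*) { return "pointer"; }
```

Calling describe(0) selects the int overload, even if a null pointer was meant, whereas describe(nullptr) selects the pointer overload.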

Another example would be perfect forwarding: once a literal 0 has been deduced as int, it can no longer be converted to a pointer. Finally, type inference with NULL does not work (unless you want to obtain an integer), so nullptr, which has its own type std::nullptr_t, is the preferred solution. Additionally, nullptr allows template specialization for the case that something is instantiated explicitly with nullptr at compile time.

#include <cstddef> // std::nullptr_t

//primary template
template<typename T, T *ptr>
struct Something {
	/* ... */
};

//explicit specialization for nullptr
template<>
struct Something<std::nullptr_t, nullptr> {
	/* ... */
};

In practice we should replace all NULL expressions by nullptr. If code breaks, then further investigation is not only required, but very useful as well.

The const keyword

Constness is great! Whenever we are in doubt about the state of a variable (field, local or parameter), we should declare it as being const. The same is true for methods of a class. If it does not change the internal state, or we are not sure about it changing the internal state of the instance, we should declare it as being a const method.

If the compiler then rejects the const, because the variable or method does modify state, we either find a bug more easily, or we have learned that const is really not an option for that variable or method.

In general, we need a consistent convention for when to take parameters as const T& references and when by value. My rule of thumb is to pass elementary value types, such as int, double or char, by value, and to use const references for all other parameters (especially ones like std::string). If modification (by reference) is required, then a non-const reference should be considered. Finally, out parameters (if really required) could be taken as pointers, but only if it is clear who owns and disposes of the pointed-to object.
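The rule of thumb could look like this in practice. This is only a sketch of the convention; every name in it is invented for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Elementary type: pass by value.
int square(int x) { return x * x; }

// Larger type, read-only: pass by const reference.
std::size_t count_long_words(const std::vector<std::string>& words,
                             std::size_t min_length) {
    std::size_t count = 0;
    for (const auto& word : words) {
        if (word.size() >= min_length) ++count;
    }
    return count;
}

// Modification intended: pass by non-const reference.
void append_exclamation(std::string& text) { text += '!'; }

// Hypothetical class with a const method: value() promises not to change
// the internal state, increment() makes no such promise.
class Counter {
public:
    int value() const { return value_; }
    void increment() { ++value_; }
private:
    int value_ = 0;
};
```

Trying to call increment() on a const Counter fails to compile, which is exactly the kind of early feedback described above.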

String literals ftw!

One of the most useful and powerful features of C has been the string (excuse me, I mean const char*) literal. C++ just followed this path and changed nothing from the original behavior. This is perfectly understandable from a historical point of view, but a big problem today. Today we do not want to limit ourselves to standard ASCII characters. Instead we go all-in on Unicode, especially UTF-8. UTF-8 is a variable-length encoding: ASCII characters take a single byte, and for everything else the high bits of the first byte indicate how many continuation bytes follow, up to a maximum length of 4 bytes per code point.

C++11 now strengthens support for various encodings by introducing prefixes for string literals. Let's consider the following examples:

const auto a = u8"I'm a UTF-8 string.";
const auto b = u"This is a UTF-16 string.";
const auto c = U"This is a UTF-32 string.";

But this is only half of the story. In order to conveniently include code points in a literal, one might use escape sequences such as \u2018 (4 hex digits), or even \U00002018 (8 hex digits). This is also a great addition to the existing wide string literal L"...".
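To see how the same code point occupies a different number of code units in each encoding, we can count the elements of the literal arrays at compile time. This is a small sketch; the helper element_count is made up for this example.

```cpp
#include <cstddef>

// Hypothetical helper: number of elements in a string literal (array),
// including the terminating null character.
template <typename Char, std::size_t N>
constexpr std::size_t element_count(const Char (&)[N]) {
    return N;
}

// U+2018 (left single quotation mark), written with a \u escape.
constexpr std::size_t utf8_units  = element_count(u8"\u2018");  // 3 bytes + terminator
constexpr std::size_t utf16_units = element_count(u"\u2018");   // 1 unit + terminator
constexpr std::size_t utf32_units = element_count(U"\u2018");   // 1 unit + terminator

// U+1F600 lies outside the Basic Multilingual Plane, so UTF-16 needs a
// surrogate pair: 2 units + terminator.
constexpr std::size_t utf16_pair_units = element_count(u"\U0001F600");
```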

Another great addition is the raw string literal, which will be discussed together with custom literals in an upcoming episode.

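As a final bit to this section I'd like to note that C++14 provides another string literal, suffixed with an s, which turns string literals directly into std::string (super useful in my opinion). The suffix lives in the namespace std::literals::string_literals, so a using directive is needed to enable it.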

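#include <string>
#include <iostream>

using namespace std::string_literals; // enables the s suffix (C++14)

void test(bool a) {
	std::cout << "Called bool" << std::endl;
}

void test(std::string a) {
	std::cout << "Called string" << std::endl;
}

int main() {
	test("123");  // const char* converts to bool (built-in conversion)
	test("123"s); // std::string literal
}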

While the first function call will result in "Called bool" being printed (converting a const char* to bool is a built-in conversion and therefore beats the user-defined conversion to std::string), the second one will result in "Called string" (which is the expected outcome, at least for most programmers). Therefore: less confusion, more reliable and much easier to handle.

Created 7/21/2014 3:35:07 PM +00:00. Last updated 8/5/2014 9:45:03 AM +00:00.