Little wonders of C++ (3)

C++ is transforming into a better language, one that is even more powerful than before and much easier to write robustly.

In this part we will focus on a few things that have been promised in the last two parts. In the beginning we will therefore look at customizing literals.

Custom literal suffixes

In the previous part I introduced (new) string literals. In the presentation I also mentioned the upcoming suffix s to immediately transform a string (const char*) literal to a std::string (basically wrapping the literal in a constructor call).

This is certainly a big step forward, as good C++ code generally tends to use std::string over the (raw) C variant. There are multiple reasons for this. Most importantly, allocation and de-allocation of the underlying char buffer are handled deterministically by the class. The standard variant also supports move construction, allowing us to write fast, elegant code with ease.

But wait a second. C++ lets us define our own suffix operators (since C++11). So, as long as our compiler is ready, we can just define such a suffix in our code and we are good. There is just one constraint: Our own suffix operators have to start with an underscore (_). Suffixes without the underscore are reserved for the standard library, which reduces the chance of code breaking later on, as more suffix operators defined by the standard committee will enter sooner or later.

Let's start by defining our own const char* to std::string literal converter:

#include <cstddef>
#include <string>

// Define the operator
std::string operator""_s(const char* ptr, std::size_t n) {
     return std::string(ptr, n);
}
// Use the operator
auto mystr = "Hello World!"_s; // auto is std::string

The signature of the function is quite important. Only a fixed set of parameter lists is permitted; with any other parameter list the declaration is not accepted as a suffix operator and the compiler reports an error.

The return type is arbitrary and does not have any constraints. This means that our own suffixes are probably best used to initialize custom types. This way writing a converter is really neat.

The following parameter lists are allowed:

( const char * ) // raw form for integer and floating-point literals
( unsigned long long int ) // integer
( long double ) // floating point
( char ) // char
( wchar_t ) // wchar
( char16_t ) // UTF-16 char
( char32_t ) // UTF-32 char
( const char*, std::size_t ) // string
( const wchar_t*, std::size_t ) // wstring
( const char16_t*, std::size_t ) // UTF-16 string
( const char32_t*, std::size_t ) // UTF-32 string
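To see a few of these parameter lists in action, here is a minimal sketch. The Distance type and the suffixes _km, _upper and _len are made up for this illustration; they are not part of any library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical value type, only used for this illustration.
struct Distance {
    long double meters;
};

// Floating-point form: selected for literals such as 1.5_km.
constexpr Distance operator""_km(long double value) {
    return Distance{value * 1000.0L};
}

// Integer form: selected for literals such as 3_km.
constexpr Distance operator""_km(unsigned long long value) {
    return Distance{static_cast<long double>(value) * 1000.0L};
}

// Character form: selected for literals such as 'a'_upper.
constexpr char operator""_upper(char c) {
    return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
}

// String form: the compiler passes the length without the terminating '\0'.
constexpr std::size_t operator""_len(const char*, std::size_t n) {
    return n;
}

// The operators are constexpr, so they also work in constant expressions.
static_assert((3_km).meters == 3000.0L, "integer form");

// Runtime check of all four suffixes.
inline void CheckLiterals() {
    assert((1.5_km).meters == 1500.0L);
    assert((3_km).meters == 3000.0L);
    assert('a'_upper == 'A');
    assert("Hello"_len == 5u);
}
```

Note that 1.5_km and 3_km pick different overloads. Without the integer form, 3_km would not compile, because an integer literal is never passed to the long double form.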

For the sake of readability, however, one should use these operators with care.

Compile time constants with constexpr

Compile-time constants are essential for optimizations. If all operands of an expression are known at compile-time, the compiler can already perform the (trivial) calculation, which then doesn't need to be done at runtime. For us programmers this is handy, as "magic" numbers may need less explanation in the form of comments. One just defines a few (well-named) constants and writes down the computation. The compiler performs the computation, which means that we do not have to compute the result manually.

Even better are compile-time constant functions. These functions may be evaluated directly during compile-time, if the arguments are compile-time constants. If the arguments are not compile-time constants, the function is called at runtime. Compile-time evaluation is only guaranteed where a constant expression is required (e.g., when initializing a constexpr variable); everywhere else the compiler is free to decide. What is important is the possibility of running the function during compile-time. This can save function calls during runtime, which may improve performance.

Now it would be great to just decorate all functions as constexpr. However, this is not possible. Functions that can be decorated as constexpr must fulfill certain requirements. In simple words: they have to be simple enough for the compiler to evaluate them. In C++11 the body is essentially limited to a single return statement (C++14 relaxes this considerably), and such functions may only use other constexpr variables and functions. Also only literal types (e.g., built-in types and simple classes with a constexpr constructor) can be used for the parameters and the return value.

constexpr int MeaningOfLife(int a, int b) { 
     return a * b;
}

constexpr int DaysOfWeek = 7;
constexpr int meaningOfLife = MeaningOfLife(6, DaysOfWeek); // Evaluated during compile-time, 42

While const for a variable could result in similar behavior, constexpr on a variable guarantees that its initializer is evaluated during compile-time.
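The difference between guaranteed and optional compile-time evaluation can be sketched as follows. The Factorial function and the other names are hypothetical and only serve this example; the function is written in the C++11 style with a single return statement.

```cpp
#include <cassert>

// C++11-style constexpr function: one return statement, so recursion and
// the conditional operator take the place of loops and if/else.
constexpr unsigned long long Factorial(unsigned int n) {
    return n <= 1 ? 1ULL : n * Factorial(n - 1);
}

// A constexpr variable requires a constant expression as initializer,
// so this call has to be evaluated by the compiler.
constexpr unsigned long long FiveFactorial = Factorial(5);

// static_assert is checked during compilation, so it proves the point.
static_assert(FiveFactorial == 120, "5! is computed at compile-time");

// With an argument that is only known at runtime, the very same function
// behaves like an ordinary function call.
inline unsigned long long FactorialAtRuntime(unsigned int n) {
    return Factorial(n);
}

inline void CheckFactorial() {
    assert(FactorialAtRuntime(5) == FiveFactorial);
}
```

The same function serves both worlds, which is exactly what a macro cannot offer in a type-safe way.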

The best thing about constexpr is that it pretty much replaces (at least one) usage of C macros: having a defined way of declaring compile-time constants. No more macro magic, and more power in the language itself.

Supporting initializer lists

C++ added support for initializer lists a long time ago, but only for arrays and simple structs (aggregates). With such a list one could, for instance, not only allocate an array with a certain size, but fill it right away. We already talked about the uniform construction in the last part, which is why I skip the basic syntax in this one.

What is important is to realize that previously there was no way of telling C++ how to use the initializer list. Either C++ knew how to use it for a certain type (e.g., a classic C array or a plain struct), or we were not able to make use of the initializer list. Class types with constructors, including std::vector<T>, belonged to the second group.

With C++11 this changed. Now the following works:

MyCollection<int> array { 2, 4, 8, 11, 15 };

How is this possible? It basically works once we set up a constructor to make use of the initializer list. Let's see an example for the code above:

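#include <cstddef>
#include <initializer_list>

template<typename T>
class MyCollection {
private:
     std::size_t _size;
     T* _values;
public:
     MyCollection(std::initializer_list<T> values) :
          _size(values.size()),
          _values(new T[_size]) {
          std::size_t index = 0;
          for (const auto& value : values)
               _values[index++] = value;
     }

     // No deep copy is implemented, so copying is disabled to avoid a double delete
     MyCollection(const MyCollection&) = delete;
     MyCollection& operator=(const MyCollection&) = delete;

     ~MyCollection() {
          delete[] _values;
     }
};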

All we need is a constructor that takes a std::initializer_list<T> as its single argument. Once we provide such a constructor, it is preferred over the other constructors whenever the object is initialized with braces. This has consequences for overload resolution; however, usually we only write such a constructor if we really want this behavior.
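The best-known consequence of this preference shows up with std::vector<int>. The following sketch uses two hypothetical helper functions to contrast parentheses with braces.

```cpp
#include <cassert>
#include <vector>

// Parentheses select the (count, value) constructor: five elements, each 1.
inline std::vector<int> MakeWithParentheses() {
    return std::vector<int>(5, 1);
}

// Braces prefer the std::initializer_list constructor: two elements, 5 and 1.
inline std::vector<int> MakeWithBraces() {
    return std::vector<int>{5, 1};
}

inline void CheckConstructorChoice() {
    assert(MakeWithParentheses().size() == 5);
    assert(MakeWithBraces().size() == 2);
}
```

The arguments are identical, yet the resulting vectors differ. That is the kind of consequence to keep in mind when adding such a constructor to an existing class.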

It should be noted that using std::initializer_list<T> is not limited to constructors. We can also write methods that support this type as argument.
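As a minimal sketch, an ordinary function can accept a braced list in the same way. The name Sum is just an assumption for this example.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical helper: adds up all values of a braced list.
inline int Sum(std::initializer_list<int> values) {
    int total = 0;
    for (int value : values)
        total += value;
    return total;
}

inline void CheckSum() {
    assert(Sum({2, 4, 8, 11, 15}) == 40);
    assert(Sum({}) == 0);  // an empty list is fine as well
}
```

The caller does not need to build a container first; the braces at the call site are enough.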

Iterators

Iterators are quite mighty. We already know them quite well from languages such as Java or C#. There, a class usually becomes iterable by implementing a specific interface. Java named this interface Iterable<T>. It basically introduces a method to return an Iterator<T>. C# follows this pattern by having IEnumerable<T> and IEnumerator<T> interfaces to represent pretty much the same thing.

But iterators have existed in C++ for a long time. They are standardized and usually come from calling a begin() or end() method. The simplest form of an iterator is a pointer.

An iterator in C++ just requires an operation such as the increment (and / or decrement) to be implemented. The de-referencing operator and a comparison (at least !=) have to be implemented as well. All these operations exist trivially on a pointer.
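That a pointer really qualifies as an iterator can be sketched with a small generic function. SumRange is a hypothetical name; the function relies on nothing but the operations listed above and a comparison with !=.

```cpp
#include <cassert>
#include <vector>

// Works with anything that supports !=, ++ and *, no matter where it comes from.
template<typename Iterator>
int SumRange(Iterator first, Iterator last) {
    int total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}

inline void CheckSumRange() {
    int raw[] = { 1, 2, 3, 4 };
    std::vector<int> numbers { 1, 2, 3, 4, 5, 6 };

    // Plain pointers into a C array serve as iterators ...
    assert(SumRange(raw, raw + 4) == 10);
    // ... just like the iterators returned by a standard container.
    assert(SumRange(numbers.begin(), numbers.end()) == 21);
}
```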

std::vector<int> numbers { 1, 2, 3, 4, 5, 6 };
std::vector<int> squares;
for (auto element = numbers.begin(); element != numbers.end(); ++element) {
     auto number = *element;
     squares.push_back(number * number);
}

The previous example iterates through all the values in the dynamic array numbers. We use the begin() method to get the starting point and check whether we have already reached the end by comparing with the result of calling the end() method, which points one past the last element.

To quote Alex Allain: "The concept of an iterator is fundamental to understanding the C++ Standard Template Library (STL) because iterators provide a means for accessing data stored in container classes such a vector, map, list, etc.". It is also crucial for the next point.

Range based loops

Now that we have introduced iterators, we want to get rid of the clumsy (but highly useful) syntax in the previous example. The for-loop is really lengthy, and one can easily imagine that such a useful pattern is used all over the place. Clumsy code everywhere!

Not with C++11! Here we have a great way out: the range based for loop. This construct is similar to foreach in C# and looks more like the range based for loop in Java. In both languages the range based loop works with classes that implement an interface to return an iterator, e.g., IEnumerable<T> in C#, which returns an IEnumerator<T> instance.

What most people who program in C# do not know: the foreach construct does not rely on the interface being implemented. It actually only cares about the functions (usually forced to be implemented by the interface) being available. This is known as duck typing. If it behaves like something that can be iterated, it is treated as something that can be iterated.

C++ also follows this path. Even worse (somehow), there is no (at least at the moment) class / interface to force certain methods to be implemented. Nevertheless, the only two things we require are begin() and end() methods that return valid iterators.
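To illustrate the duck typing, here is a sketch of a type that is not a container at all. IntRange and its nested Iterator are hypothetical names; the class stores no elements and merely offers begin() and end().

```cpp
#include <cassert>

// Hypothetical half-open range of integers: [first, last).
class IntRange {
public:
    // Minimal iterator: only the three operations the loop needs.
    class Iterator {
    public:
        explicit Iterator(int value) : _value(value) {}
        int operator*() const { return _value; }
        Iterator& operator++() { ++_value; return *this; }
        bool operator!=(const Iterator& other) const {
            return _value != other._value;
        }
    private:
        int _value;
    };

    IntRange(int first, int last) : _first(first), _last(last) {}

    Iterator begin() const { return Iterator(_first); }
    Iterator end() const { return Iterator(_last); }
private:
    int _first;
    int _last;
};

// No base class and no interface: begin() and end() are all the loop asks for.
inline int SumOfRange(int first, int last) {
    int total = 0;
    for (int value : IntRange(first, last))
        total += value;
    return total;
}

inline void CheckIntRange() {
    assert(SumOfRange(1, 5) == 10);  // 1 + 2 + 3 + 4
}
```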

Let's extend our collection above with the required calls:

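#include <cstddef>
#include <initializer_list>

template<typename T>
class MyCollection {
private:
     std::size_t _size;
     T* _values;
public:
     MyCollection(std::initializer_list<T> values) :
          _size(values.size()),
          _values(new T[_size]) {
          std::size_t index = 0;
          for (const auto& value : values)
               _values[index++] = value;
     }

     // No deep copy is implemented, so copying is disabled to avoid a double delete
     MyCollection(const MyCollection&) = delete;
     MyCollection& operator=(const MyCollection&) = delete;

     // Iterator to the first element (a plain pointer is sufficient)
     T* begin() {
          return _values;
     }

     // Iterator one past the last element
     T* end() {
          return _values + _size;
     }

     ~MyCollection() {
          delete[] _values;
     }
};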

And that's it! Now we can use our own collection in a range based for loop. The following code illustrates how a range based for loop can be used to iterate over a container.

std::vector<int> numbers { 1, 2, 3, 4, 5, 6 };
std::vector<int> squares;
for (auto number : numbers)
     squares.push_back(number * number);

We saved a line and some characters. We also made the code more meaningful and robust. Isn't that great? We can also use it with our own collection:

MyCollection<int> array { 2, 4, 8, 11, 15 };

for (const auto& value : array)
     std::cout << "The value is : " << value << std::endl;

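Note that const auto& only makes the loop variable const. Since array itself is not const, the loop above still calls the non-const begin() / end(). If the collection itself were const, the const versions of begin() / end() would be required. In this case they are not available, so such a loop would not compile. In general we should also provide such methods, resulting in 4 (instead of 2) additional methods.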
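The four methods could look like the following sketch. It repeats the collection in a compact form so that it stands on its own; the helper functions Total and CheckConstIteration are hypothetical and only demonstrate which overload is picked. Copying is disabled here as a simplification, since no deep copy is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

template<typename T>
class MyCollection {
private:
    std::size_t _size;
    T* _values;
public:
    MyCollection(std::initializer_list<T> values) :
        _size(values.size()),
        _values(new T[_size]) {
        std::size_t index = 0;
        for (const auto& value : values)
            _values[index++] = value;
    }

    // Simplification for this sketch: no copies, hence no double delete.
    MyCollection(const MyCollection&) = delete;
    MyCollection& operator=(const MyCollection&) = delete;

    // Non-const overloads: elements may be changed through the loop variable.
    T* begin() { return _values; }
    T* end() { return _values + _size; }

    // Const overloads: picked when the collection itself is const.
    const T* begin() const { return _values; }
    const T* end() const { return _values + _size; }

    ~MyCollection() { delete[] _values; }
};

// The parameter is a const reference, so the loop needs the const overloads.
inline int Total(const MyCollection<int>& collection) {
    int total = 0;
    for (const auto& value : collection)
        total += value;
    return total;
}

inline void CheckConstIteration() {
    MyCollection<int> array { 2, 4, 8, 11, 15 };
    for (auto& value : array)    // non-const overloads, elements are modified
        value *= 2;
    assert(Total(array) == 80);  // const overloads inside Total
}
```

Without the two const overloads, Total would not compile, while the loop in CheckConstIteration would still work.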

Created 8/2/2014 4:55:10 PM +00:00. Last updated 8/2/2014 6:21:59 PM +00:00.