Little wonders of C++ (4)

C++ is transforming into a better language, one that is even more powerful than before and much easier to write robustly.

In this part we will look at some best practices and new capabilities. One of these new language features is the lambda expression.

Lambda expressions

Lambda expressions are really neat. People who know them from C# love them. It took some years, but Java finally has lambda expressions in the language as well. Even better, C++ had lambda expressions before Java!

What exactly are lambda expressions? Well, for us lambda expressions present a way of defining functions without giving them a name. This is a so-called anonymous function.

There are two main benefits of using lambda expressions. First, we can define functions in place, right where we need them. This avoids cluttering the code with helper functions that are used only once, and it removes any doubt about which function is called in a given scenario. Second, we are able to capture local variables.

Capturing local variables is a process that can be easily explained in C#. In C++ the details are a little bit more complicated, but the overall process is the same. A lambda expression evaluates to an instance of a compiler-generated class with a function call operator, operator(). The code that represents the body of the lambda expression is then just the body of that operator.

Finally, since we obviously have a class, we can have fields in that class, too. These fields are the captured variables, while the parameters are just the parameters for the invocation operator. Lambda expressions that capture local variables are usually called closures.
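
To make the class analogy concrete, here is a rough sketch of what a compiler might conceptually generate for a lambda that captures one variable by value. The names ScaledProduct, with_class and with_lambda are made up for illustration; real closure types are unnamed and their layout is up to the compiler.

```cpp
// Hypothetical hand-written equivalent of
//   [factor](double a, double b) { return factor * a * b; }
class ScaledProduct {
public:
	explicit ScaledProduct(double factor) : factor_(factor) {}

	// The lambda body becomes the body of the function call operator.
	// It is const because by-value captures are read-only by default.
	double operator()(double a, double b) const {
		return factor_ * a * b;
	}

private:
	double factor_; // the captured variable, stored as a field
};

// Uses the hand-written class.
double with_class(double factor, double x, double y) {
	ScaledProduct operation(factor);
	return operation(x, y);
}

// Uses the lambda; the result is the same.
double with_lambda(double factor, double x, double y) {
	auto operation = [factor](double a, double b) { return factor * a * b; };
	return operation(x, y);
}
```

With these definitions, with_class(2.0, 2.0, 3.0) and with_lambda(2.0, 2.0, 3.0) both yield 12.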

Let's have a look at a simple example of a non-capturing lambda:

int main() {
	auto operation = [](double x, double y)->double { return x * y; };
	auto result = operation(2.0, 3.0);//6
}

Granted, the syntax is a little bit more complicated than in C# or Java, but this is C++ after all. The good thing is that we have total control over the lambda expression. The first term (in square brackets) defines which variables to capture. The second term defines the parameters. Then we have the optional trailing return type. In C++11 it may be omitted if the body consists of a single return statement; since C++14 the return type is also deduced for longer bodies, as long as every return statement yields the same type. Finally we have the body of the lambda.

In C# we can omit the statement block if we just want to return the result of a single expression. This is not possible in C++. Here we always require a statement block, enclosed in curly braces.

int main() {
	double factor = 2.0;
	auto operation = [factor](double x, double y)->double { return factor * x * y; };
	factor = 3.0;
	auto result = operation(2.0, 3.0);//12
}

In the previous example we capture the local variable factor. This capturing is done by value, i.e. the value of factor will be copied. Changing the value later will have no effect on the lambda expression.

int main() {
	double factor = 2.0;
	auto operation = [&factor](double x, double y)->double { return factor * x * y; };
	factor = 3.0;
	auto result = operation(2.0, 3.0);//18
}

This example is nearly equivalent, but this time we are capturing the local variable by reference. Hence the lambda always sees the current value.

The capturing as presented in the above two examples is called explicit. We are required to name every variable that we want to capture, either by reference or by value. Sometimes we just don't care about these details. In these scenarios C++ offers two special kinds of capturing options:

[=] captures all local variables mentioned in the body of the lambda by value
[&] captures all local variables mentioned in the body of the lambda by reference

These two ways can be combined with explicit rules, which allows us to specify a rule like "capture everything by reference, except foo, which should be captured by value".
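
A minimal sketch of such combined capture lists might look as follows. The function and variable names are invented for this illustration; each function returns what the lambda computes after both captured variables have been changed.

```cpp
// Hypothetical helper: default capture by reference, offset by value.
double by_reference_except_offset() {
	double scale = 2.0;
	double offset = 1.0;

	// offset is copied right here; scale is only referenced.
	auto operation = [&, offset](double x) { return scale * x + offset; };

	scale = 10.0;   // seen by the lambda (captured by reference)
	offset = 100.0; // not seen by the lambda (captured by value)

	return operation(3.0); // 10 * 3 + 1
}

// Hypothetical helper: default capture by value, scale by reference.
double by_value_except_scale() {
	double scale = 2.0;
	double offset = 1.0;

	auto operation = [=, &scale](double x) { return scale * x + offset; };

	scale = 10.0;   // seen by the lambda (captured by reference)
	offset = 100.0; // not seen by the lambda (captured by value)

	return operation(3.0); // 10 * 3 + 1
}
```

Both functions return 31, because in each case only the change to scale is visible inside the lambda.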

Of course this is just a rough sketch. There is much more to write about lambda expressions in C++. For instance it is remarkable that the round brackets are optional (if no parameter is given). This yields the perhaps shortest possible lambda expression: [] {}.

Forget function pointers

Ah, function pointers. These miracle workers of indirect invocation. They are the machinery behind object-oriented programming, giving us the right function for the corresponding type. But in C++ we usually want them to stay implicit. The compiler should handle polymorphism. And we should also abstract away other usages, such as passing functions as arguments to functions.

A proper way that is certainly considered a best practice is creating a class, overloading the function call operator and passing around instances of this class. This is also the concept behind delegates in C#, even though one never sees the actual generated class.

What the .NET-Framework team realized: One does not need so many delegates. In the end it just comes down to a few, called Func<T> and Action<T>. I left out the various variations of type arguments. What matters is: We have delegates for functions with a return type and without. And we also have a variation in the number of arguments (from no to n arguments).

Well, C++ now also provides a solution for this common problem. The header <functional> provides std::function. Since C++ has variadic templates and lets us use void explicitly, this single template is sufficient as a basis for all possible lambda expressions.

#include <functional>
#include <iostream>

using namespace std;

double compute(function<double(double, double)> f) {
	return f(3.0, 2.0);
}

int main() {
	function<double(double, double)> add = [](double x, double y) { return x + y; };
	function<double(double, double)> sub = [](double x, double y) { return x - y; };
	function<double(double, double)> mul = [](double x, double y) { return x * y; };
	function<double(double, double)> div = [](double x, double y) { return x / y; };

	cout << compute(add) << endl;//5
	cout << compute(sub) << endl;//1
	cout << compute(mul) << endl;//6
	cout << compute(div) << endl;//1.5
}

Using std::function makes great sense, especially since the actual type of a lambda expression is a unique, unnamed class type that we cannot spell out ourselves. Also, since std::function is a class template, it can store lambdas that capture variables.
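
As a small sketch of std::function holding a capturing lambda, consider the following. The names make_scaler and apply_twice are hypothetical helpers, not part of the standard library.

```cpp
#include <functional>

// Hypothetical factory: the returned std::function stores a closure
// that keeps its own copy of factor.
std::function<double(double)> make_scaler(double factor) {
	return [factor](double x) { return factor * x; };
}

// Accepts anything callable with a matching signature,
// whether it captures variables or not.
double apply_twice(const std::function<double(double)>& f, double x) {
	return f(f(x));
}
```

Calling apply_twice(make_scaler(3.0), 2.0) gives 18. The closure outlives the call to make_scaler without problems, because factor was captured by value.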

Function pointers still work with lambda expressions, but only if no variables have been captured. Here we could write something like:

typedef int (*func)();

int main() {
	func f = []()->int { return 2; };
	f();
}

Prefer vector over list

The STL is a wonderful library that contains really useful containers and data types. Whenever we can directly use a type from the STL, we should do it. The result will be more standard-conforming, more portable and usually also faster than a hand-written alternative. Also neat extras such as move constructors, iterators and more are already available.

Two examples of great types are std::vector<T> and std::list<T>. The first one is a dynamic array (similar to List<T> in .NET), i.e. an array with a capacity and a current length, that expands once the current length exceeds the capacity. The second type is a linked list (similar to LinkedList<T> in .NET). One may argue that a linked list is the way to go, but this is actually only true in some scenarios.

Once we find ourselves doing lots of traversals (going from one end to the other), a linked list is inferior to a real array. The reason is simple: In an array we can just stream from beginning to end. In a linked list, we need to make many jumps, resulting in cache misses, which are very expensive.

There are ongoing discussions about when to use which. If you are uncertain, then just go with std::vector<T>. This one is superior in random access, which is usually demanded. A linked list is superior only if we frequently insert and erase at arbitrary positions within the list. Then no re-shuffling is required, only the links between neighboring nodes have to be rebuilt.
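
The two access patterns can be sketched as follows. The helper names are made up for this illustration, and the comments describe the usual complexity guarantees rather than measured timings.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <numeric>
#include <vector>

// Contiguous storage: traversal streams through memory and
// indexed access takes constant time.
int sum_and_pick(const std::vector<int>& values, std::size_t index) {
	int sum = std::accumulate(values.begin(), values.end(), 0);
	return sum + values[index];
}

// Linked nodes: reaching the middle means walking node by node,
// but the insertion itself only relinks two neighbors.
std::list<int> insert_in_middle(std::list<int> values, int new_value) {
	auto steps = static_cast<std::ptrdiff_t>(values.size() / 2);
	auto middle = std::next(values.begin(), steps);
	values.insert(middle, new_value);
	return values;
}
```

For example, insert_in_middle({1, 2, 4, 5}, 3) produces the list 1, 2, 3, 4, 5, and sum_and_pick({1, 2, 3, 4}, 2) returns 13.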

Follow RAII

Many of the topics we discuss in this series try to minimize mistakes by hiding implementation details. Such details include raw pointer handling, memory allocation, construction and destruction of objects and more. One can see a pattern emerging here around pointers.

C++ lets objects live directly on the stack, and we should make use of that. The actual implementation may use the heap internally, but the objects themselves should be used as stack values. This way we get guaranteed behavior that is deterministic and far less error-prone. A pattern that gives us a nice recipe for constructing classes in such a way that they can be used on the stack, with guaranteed cleanup, is called RAII. RAII stands for Resource Acquisition Is Initialization.

In RAII, holding a resource is tied to object lifetime: resource allocation (acquisition) is done during object creation (specifically initialization), i.e. by the constructor. On the other hand, resource deallocation (release) is done during object destruction, i.e. by the destructor. If objects are destructed properly, resource leaks do not occur. The cleanup is deterministic.

RAII is also the origin for useful constructs such as std::shared_ptr<T> or std::vector<T>, etc. But not only memory can be handled safely following this pattern, also resources such as connections, e.g., to a database, handles in general or other objects, that require special closing attention, will benefit from the pattern.

The following example has been taken from Wikipedia. It demonstrates the usage of RAII for file access and mutex locking:

#include <string>
#include <mutex>
#include <iostream>
#include <fstream>
#include <stdexcept>

using namespace std;
 
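void write_to_file (const string & message) {
    // Named file_mutex so that the variable does not hide the type std::mutex
    static mutex file_mutex;
    // Locks the mutex now and unlocks it when lock goes out of scope
    lock_guard<mutex> lock(file_mutex);
    // The file is closed when file goes out of scope
    ofstream file("example.txt");

    if (!file.is_open())
        throw runtime_error("unable to open file");

    file << message << endl;
}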

There are two magic points. One is lock_guard<T> and the other is ofstream. Reading such code with a C# background might feel weird. Why is the mutex not unlocked? Why is the handle controlling the file not released?

The magic indeed lies in the stack allocation. For such objects C++ automatically inserts the destructor calls as soon as the scope is left. In the previous example this happens on two occasions:

In case that the file cannot be opened (an exception is thrown here)
Once the block statement of the method ends (after writing the message)

Usually we would be required to handle everything ourselves. In C# we would have a using statement for a similar flow. Now - no matter where we leave the method - we can be sure that the mutex is unlocked and that the file is released.

So in short, using the RAII pattern boils down to writing a class like the following:

class MyRaii {
public:
	MyRaii() {
		//Acquire resource here
	}

	~MyRaii() {
		//Release resource here
	}
};
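
To see that the destructor really runs on every exit path, one could fill in the skeleton with a class that merely records its lifetime. The class Tracer and the function run below are hypothetical and only stand in for a real resource such as a handle or a lock.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical resource wrapper that writes its lifetime into a log string.
class Tracer {
public:
	explicit Tracer(std::string& log) : log_(log) {
		log_ += "acquire;"; // stands in for opening a handle, locking, ...
	}

	~Tracer() {
		log_ += "release;"; // runs on every exit path of the scope
	}

	// Copies would release the same resource twice, so forbid them.
	Tracer(const Tracer&) = delete;
	Tracer& operator=(const Tracer&) = delete;

private:
	std::string& log_;
};

// Returns the recorded events; optionally leaves the scope via an exception.
std::string run(bool fail) {
	std::string log;
	try {
		Tracer tracer(log);
		if (fail)
			throw std::runtime_error("something went wrong");
		log += "work;";
	} catch (const std::runtime_error&) {
		log += "caught;";
	}
	return log;
}
```

Here run(false) yields "acquire;work;release;" and run(true) yields "acquire;release;caught;". In the failing case the release happens before the handler runs, because the stack is unwound first.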

Using more than namespaces

Using namespaces in C++ is a familiar concept to most programmers. There are some restrictions and best practices, e.g., never put a `using namespace` directive at global scope in a header file, and don't use one just to shorten the name of a single type. This section, however, goes beyond standard usage.

First of all C++ does not like nested namespaces. They are possible, of course, but one should only use nested namespaces to distinguish between internal details and the public API. That being said it is sometimes beneficial to have an alias for a namespace.

namespace nested = this_one::is_deeply::nested;

Also we might already be in a namespace, so we can reference other namespaces by their relative name. In fact, all the namespace references presented so far are relative. To make a namespace reference absolute, we need the scope resolution operator (::) as a prefix. For instance:

namespace nested = ::this_one::is_deeply::nested;
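
The difference between a relative and an absolute reference only shows up when the same namespace name exists at more than one level. The following sketch constructs such a situation on purpose; the namespace app, the function answer and the alias names are invented for the example.

```cpp
namespace this_one { namespace is_deeply { namespace nested {
	int answer() { return 42; }
}}}

namespace app {
	// A second namespace hierarchy with the same names, nested inside app.
	namespace this_one { namespace is_deeply { namespace nested {
		int answer() { return 7; }
	}}}

	// Relative: lookup starts in app and finds app::this_one first.
	namespace relative_alias = this_one::is_deeply::nested;
	// Absolute: the leading :: forces lookup to start at global scope.
	namespace absolute_alias = ::this_one::is_deeply::nested;

	int relative_answer() { return relative_alias::answer(); }
	int absolute_answer() { return absolute_alias::answer(); }
}
```

With this setup app::relative_answer() returns 7, while app::absolute_answer() returns 42.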

Finally, we don't need to import everything from a namespace. Importing a whole namespace has no runtime cost, but it may take a little bit longer to compile. More importantly, there might be confusion about which function is actually called.

Consider the following example:

#include <cmath>

using namespace std;

namespace primitives {
	struct Angle {
		double Value;
	};

	Angle sqrt(Angle angle) {
		return Angle { sqrt(angle.Value) };
	}
}

int main() {
	using namespace primitives;
	Angle a { M_PI * 0.5 };
	Angle b = sqrt(a);
}

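The code above won't compile. We get the following error: could not convert ‘angle.primitives::Angle::Value’ from ‘double’ to ‘primitives::Angle’. Ouch! But we included cmath and imported the std namespace. How is this possible? Inside the namespace primitives, the name sqrt is found in primitives first. Name lookup stops at the first scope that contains a match, so the sqrt functions from std, which the using directive only makes visible at global scope, are never considered.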

Well, calling the function with its fully qualified name is enough. So we can solve this problem that way:

Angle sqrt(Angle angle) {
	return Angle { std::sqrt(angle.Value) };
}

This is, however, not the best solution. A more flexible solution is to tell C++ where else it could look for an appropriate sqrt function. This can be achieved with a using declaration. Note that this is not the using directive, which is used together with the namespace keyword.

Angle sqrt(Angle angle) {
	using std::sqrt;
	return Angle { sqrt(angle.Value) };
}

This version is also very useful in combination with templates and other possibilities, where we do not know what specific function should be called. We only know that such a function exists, which has the given name and signature.
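
A sketch of how this plays out in a template is shown below. The function fourth_root is a hypothetical helper; the Angle type and its sqrt overload are repeated from the example above so that the snippet stands on its own.

```cpp
#include <cmath>

namespace primitives {
	struct Angle {
		double Value;
	};

	Angle sqrt(Angle angle) {
		return Angle { std::sqrt(angle.Value) };
	}
}

// Hypothetical generic helper: it does not know which sqrt it will call.
template <typename T>
T fourth_root(T value) {
	// Makes the std overloads candidates, e.g. for double.
	using std::sqrt;
	// For Angle, primitives::sqrt is added to the candidates through
	// argument-dependent lookup and wins overload resolution.
	return sqrt(sqrt(value));
}
```

Now fourth_root(16.0) uses std::sqrt and returns 2, while fourth_root(primitives::Angle { 81.0 }) uses primitives::sqrt and returns an Angle with Value 3.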

Created 8/8/2014 11:48:07 AM +00:00. Last updated 8/8/2014 11:50:23 AM +00:00.