Modernize Your C# Code - Part IV: Types

Want to modernize your C# codebase? Let's finish with types.

Modernizing C# Code

Table of Contents

Introduction

In the recent years C# has grown from a language with exactly one feature to solve a problem to a language with many potential (language) solutions for a single problem. This is both, good and bad. Good, because it gives us as developers freedom and power (without compromising backwards compatibility) and bad due to the cognitive load that is connected with making decisions.

In this series we want to explore what options exist and where these options differ. Surely, some may have advantages and disadvantages under certain conditions. We will explore these scenarios and come up with a guide to make our life easier when renovating existing projects.

This is part IV of the series. You can find part I, part II, as well as part III on the CodeProject.

Background

In the past I've written many articles particularly targetted at the C# language. I've written introduction series, advanced guides, articles on specific topics like async / await or upcoming features. In this article series I want to combine all the previous themes in one coherent fashion.

I feel its important to discuss where new language features shine and where the old - let's call them established - ones are still preferred. I may not always be right (especially, since some of my points will surely be more subjective / a matter of taste). As usual leaving a comment for discussion would be appreciated!

Let's start off with some historic context.

Classic Type Systems

Historically, types have been introduced to tell the developers how many bytes by an allocation will be reserved. Additionally, simple things such as additions could then be also figured out by the compiler. Earlier, in pure assembler developers had to decide what operation fits to the given values. Now, the compiler was capable of knowing that not only 4 bytes have been reserved by the two values, but also that they should be treated like integers. An Int32 addition would apply.

Later on the need to introduce custom types was communicated. The first structures have been born. While standard operations (from a machine point of view) may not make much sense, the allocation and structure (hence the name) was key. Not only could we have "names" (usually erased at compile-time, i.e., only known to the compiler for convenience of the developer), but all the parts have been properly specified by position and type.

With the introduction of object-oriented programming and its first interpretations we have seen much more importance on the concept of types (mostly associated then with classes) and their relations to functions (then called methods). The relevance of type information inspection / access at runtime increased leading to capabilities like reflection. While classic native systems usually have very limited runtime capabilities (e.g., C++) managed systems appeared with vast possibilities (e.g., JVM or .NET).

Now one of the issues with this approach today is that many types are no longer originally coming from the underlying system - they come from deserialization of some data (e.g., incoming request to a web API). While the basic validation and deserialization could be coming from a type defined in the system, usually it comes just from a derivation of such a type (e.g., omitting certain properties, adding new ones, changing types of certain properties, ...). As it stands duplication and limitations arise when dealing with such data. Hence the need for dynamic programming languages, which offer more flexibility in that regard - at the cost of type safety for development.

Every problem has a solution and in the last 10 years we've seen new love for the type systems and type theory appearing all over the place. Popular languages such as TypeScript bring the results of years of research and other (more exotic) programming languages to the mainstream. Hopefully, some of the more classic and historic programming languages are also able to learn from these advancements.

Dissecting C#'s Type System

This could also be called .NET's type system, however, while there is certainly some common base layer coming from .NET many constructs and possibilities just come from the language. In a different aspect if we look how F# uses .NET's type system we know that there is no natural limitation given by .NET - the system can bend and extended by a far margin.

C# likes to work with a static type system. And the word static here means something. Let's pretend we have the following type:

public class Person
{
	public string FirstName { get; set; }

	public string LastName { get; set; }

	public DateTime Birthday { get; set; }
}

What if we want to enforce all properties to be optional? Well, actually in some sense they are already as no one forces us to set them. But let's pretend nullable types have been introduced in this article already (they will be later) and what we are after is something like:

public class PartialPerson
{
	public string? FirstName { get; set; }

	public string? LastName { get; set; }

	public DateTime? Birthday { get; set; }
}

Now we have an issue. Once the first class changes we also need to make some change on the second class. What if we could instead write something like:

public type PartialPerson = Partial<Person>;

That's actually how TypeScript works. In TypeScript Partial<T> is just an alias for iterating over all properties and putting an optional (?) on every property.

Alright, so C# does not like this. C# is more runtime oriented. Hence we should use reflection for this.

At runtime this could look as follows:

var PartialPersonType = Partial<Person>();

where the Partial method could be implemented in a straight forward way.

public static Type Partial<T>()
{
	var type = typeof(T);
	var builder = GetTypeBuilder<T>();
	
	foreach (var property in type.GetProperties())
	{
		CreateProperty(builder, property);
	}

	return builder.CreateType();
}

private static TypeBuilder GetTypeBuilder<T>()
{
	var name = $"Partial<{typeof(T).Name}>";
	var an = new AssemblyName(name);
	var assemblyBuilder = AppDomain.CurrentDomain.DefineDynamicAssembly(an, AssemblyBuilderAccess.Run);
	var moduleBuilder = assemblyBuilder.DefineDynamicModule("MainModule");
	return moduleBuilder.DefineType(name,
			TypeAttributes.Public |
			TypeAttributes.Class |
			TypeAttributes.AutoClass |
			TypeAttributes.AnsiClass |
			TypeAttributes.BeforeFieldInit |
			TypeAttributes.AutoLayout,
			null);
}

private static void CreateProperty(TypeBuilder tb, PropertyInfo property, string? ignore = null)
{
	var propertyName = property.Name;
	var propertyType = property.PropertyType;
	var attributes = property.Attributes;
	var customAttributes = property.CustomAttributes;
	var addNullable = false;
	
	if (propertyType.IsInterface || propertyType.IsClass)
	{
		// we require the custom attribute
		addNullable = true;
	}
	else if (propertyType.IsValueType && (!propertyType.IsGenericType || propertyType.GetGenericTypeDefinition() != typeof(Nullable<>)))
	{
		// for values there is no attribute but the Nullable type
		// we only apply it if its not yet wrapped in such a type
		propertyType = typeof(Nullable<>).MakeGenericType(propertyType);
	}
	
	var fieldBuilder = tb.DefineField("_" + propertyName, propertyType, FieldAttributes.Private);
	var propertyBuilder = tb.DefineProperty(propertyName, attributes, propertyType, null);

	foreach (var customAttribute in customAttributes)
	{
		// Append all custom attributes (as beforehand) except the Nullable one
		if (customAttribute.Constructor.ReflectedType.Name != "NullableAttribute")
		{
			AppendAttribute(customAttribute, propertyBuilder);
		}
	}
	
	// if the nullable attribute should be added we can abuse some magic ...
	if (addNullable)
	{
		var customAttribute = MethodBase.GetCurrentMethod().GetParameters().Last().CustomAttributes.Last();
		AppendAttribute(customAttribute, propertyBuilder);
	}
	
	var getPropMthdBldr = tb.DefineMethod("get_" + propertyName, MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig, propertyType, Type.EmptyTypes);
	var getIl = getPropMthdBldr.GetILGenerator();

	getIl.Emit(OpCodes.Ldarg_0);
	getIl.Emit(OpCodes.Ldfld, fieldBuilder);
	getIl.Emit(OpCodes.Ret);

	var setPropMthdBldr = tb.DefineMethod("set_" + propertyName,
		MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig,
		null, new[] { propertyType });

	var setIl = setPropMthdBldr.GetILGenerator();
	var modifyProperty = setIl.DefineLabel();
	var exitSet = setIl.DefineLabel();

	setIl.MarkLabel(modifyProperty);
	setIl.Emit(OpCodes.Ldarg_0);
	setIl.Emit(OpCodes.Ldarg_1);
	setIl.Emit(OpCodes.Stfld, fieldBuilder);
	setIl.Emit(OpCodes.Nop);
	setIl.MarkLabel(exitSet);
	setIl.Emit(OpCodes.Ret);

	propertyBuilder.SetGetMethod(getPropMthdBldr);
	propertyBuilder.SetSetMethod(setPropMthdBldr);
}

private static void AppendAttribute(CustomAttributeData customAttribute, PropertyBuilder propertyBuilder)
{
	var args = customAttribute.ConstructorArguments.Select(m => m.Value).ToArray();
	var cab = new CustomAttributeBuilder(customAttribute.Constructor, args);
	propertyBuilder.SetCustomAttribute(cab);
}

While its certainly possible to create types on the fly as shown, the fact remains that this is a runtime mechanism. As such many of the potential use cases for type transformation are either a lot harder to accomplish, or impossible.

There are, however, community projects such as Fody to manipulate assemblies and / or IL code for adding such things already at compile-time. The major issue with these is the compiler assistance / tooling. It's often not so easy to see what's really going on.

The Modern Way

A type remains a type. But wait! There is a little bit more to it. We have a lot of capabilities that either come with C# directly, with .NET, or are given by the ecosystem. In this section we'll try to explore all of them.

Actually, while many of the syntax used in C# directly goes to some IL code or code constructs, some (mostly newer, but also as we will see really old) parts of C# tend to work closely together with the type system on a natural level. They either use existing interfaces, types, or other elements - sometimes creating new types without us even realizing. We already saw for instance classes for delegates (e.g., a Func<T>) or local functions being created. Let's see what else is available!

Generating Iterators

Since the first versions of C# we are able to generate types on the fly. Using yield we have the power to start our own iterator. Such an iterator is represented by a type that implements the IEnumerable interface. It turns out that the only thing to do here is to somehow create an IEnumerator instance. All the logic (and state) is then contained in the IEnumerator instance.

Let's first code our own implementation. What we want is an enumerable of the first three numbers (1, 2, 3).

class MyEnumerable : IEnumerable<int>
{
	public IEnumerator<int> GetEnumerator() => new MyIterable();

	IEnumerator IEnumerable.GetEnumerator() => this.GetEnumerator();

	class MyIterable : IEnumerator<int>
	{
		public int Current => current;

		object IEnumerator.Current => current;
		
		private int current = 0;

		public void Dispose() {}

		public bool MoveNext() => ++current < 4;

		public void Reset() => current = 0;
	}
}

The C# language has plenty of nice features to deal with enumerators. Certainly the most used is the foreach loop construct:

var enumerable = new MyEnumerable();

foreach (var item in enumerable)
{
	item.Dump(); // 1, 2, 3
}

Obviously, this one is just syntax sugar for the following code:

var enumerable = new MyEnumerable();
var iterable = enumerable.GetEnumerator();

while (iterable.MoveNext())
{
	var item = iterable.Current;
	item.Dump();
}

A quick comparison at the MSIL code confirms this quite easily. The implicit version using the foreach loop looks as follows:

IL_0000:  nop         
IL_0001:  newobj      UserQuery+MyEnumerable..ctor
IL_0006:  stloc.0     // enumerable
IL_0007:  nop         
IL_0008:  ldloc.0     // enumerable
IL_0009:  callvirt    UserQuery+MyEnumerable.GetEnumerator
IL_000E:  stloc.1     
IL_000F:  br.s        IL_0021
IL_0011:  ldloc.1     
IL_0012:  callvirt    System.Collections.Generic.IEnumerator<System.Int32>.get_Current
IL_0017:  stloc.2     // item
IL_0018:  nop         
IL_0019:  ldloc.2     // item
IL_001A:  call        LINQPad.Extensions.Dump<Int32>
IL_001F:  pop         
IL_0020:  nop         
IL_0021:  ldloc.1     
IL_0022:  callvirt    System.Collections.IEnumerator.MoveNext
IL_0027:  brtrue.s    IL_0011
IL_0029:  leave.s     IL_0036
IL_002B:  ldloc.1     
IL_002C:  brfalse.s   IL_0035
IL_002E:  ldloc.1     
IL_002F:  callvirt    System.IDisposable.Dispose
IL_0034:  nop         
IL_0035:  endfinally  
IL_0036:  ret         

For comparison, the explicit version compiles to be:

IL_0000:  nop         
IL_0001:  newobj      UserQuery+MyEnumerable..ctor
IL_0006:  stloc.0     // enumerable
IL_0007:  ldloc.0     // enumerable
IL_0008:  callvirt    UserQuery+MyEnumerable.GetEnumerator
IL_000D:  stloc.1     // iterable
IL_000E:  br.s        IL_0020
IL_0010:  nop         
IL_0011:  ldloc.1     // iterable
IL_0012:  callvirt    System.Collections.Generic.IEnumerator<System.Int32>.get_Current
IL_0017:  stloc.2     // item
IL_0018:  ldloc.2     // item
IL_0019:  call        LINQPad.Extensions.Dump<Int32>
IL_001E:  pop         
IL_001F:  nop         
IL_0020:  ldloc.1     // iterable
IL_0021:  callvirt    System.Collections.IEnumerator.MoveNext
IL_0026:  stloc.3     
IL_0027:  ldloc.3     
IL_0028:  brtrue.s    IL_0010
IL_002A:  ret         

Since the IEnumerator implements the IDisposable interface we should have also disposed the resource correctly. The foreach syntax does that for us. Another reason for always using the generated code - it just makes our life easier by doing the right things without us having to remember.

Still, its not the foreach part that strikes us, but rather the generation of the the class for the IEnumerable / IEnumerator implementation.

Let's use the yield keyword to do that.

static IEnumerable<int> GetNumbers()
{
	var current = 0;
	
	while (++current < 4)
	{
		yield return current;
	}
}

The interesting thing is that this small piece of code already represents the full iterator as specified above. The C# compiler generates all the necessary types for us to make it work. The usage is also the same, except that instead of an explicit constructor call (new MyEnumerable()) we just call the function (GetNumbers()). Great!

Generated IEnumerable

Let's recap what's so great about iterators.

Useful for Avoid for
  • Custom iterations
  • Generalized enumerations
  • State machines
  • Iterating arrays (if known)

Discards

The C# compiler imposes quite some restrictions to the developer. Some of these restrictions are included to safe guard against obvious errors, while others are included to shield the user from running potentially useless code. One of this restrictions forbids to use certain expressions without assignment.

Consider the following code:

static void Main()
{
	2 + 3;
}

Now that's a strange code. It would compute the result of 2 + 3 but it would not do anything with it. In a nutshell, either the compiler would optimize this statement away or we would just waste some CPU cycles.

Personally, I think it's a strange restriction. Yes, the code above would be useless, but since C# allows operator overloading there could be scenarios where simple add expressions would actually have meaningful side effects.

A scenario where the (negative or annoying) implications of this design choice can be seen more practically is a simple nullability test.

static void Run(Action action)
{
	action ?? throw new ArgumentNullException(nameof(action));
	action();
}

This could will not work. Instead, the following does:

static void Run(Action action)
{
	(action ?? throw new ArgumentNullException(nameof(action)))();
}

Call expressions are considered okay by design. Obviously, the side-effect tendency of method calls was regarded very high - especially with respect to the "improbable" rating of operators.

We would, however, still like to keep version 1 as its more readable. For this reason, i.e., to mitigate the consequences of this historic design choice, a special kind of construct was introduced: Discards.

The idea behind discards is simple: Introduce a special variable called _ that can always be assigned to. It can never be read - it is a write-only variable that will be optimized away by the compiler anyway.

Using this variable we can come back to version 1:

static void Run(Action action)
{
	_ = action ?? throw new ArgumentNullException(nameof(action));
	action();
}

That _ is a special kind of construct can be seen on multiple occassions. Let's say we have multiple of these checks:

static void Run<T>(Action<T> action, T arg)
	where T : class
{
	_ = action ?? throw new ArgumentNullException(nameof(action));
	_ = arg ?? throw new ArgumentNullException(nameof(arg));
	action(arg);
}

Obviously, the type of action and arg will most likely be different. In any case the assignment is accepted. The same can be done with unnecessary out parameters:

if (int.TryParse(str, out _))
{
	// parsed successfully, but we don't care about the result
}

Another useful instance is to "fire and forget" tasks. Earlier, I usually introduced an extension method that looks as follows:

public static class TaskExtensions
{
	public static void Forget(this Task task)
	{
		// Empty on purpose; maybe log something?
	}
}

The advantage was that now I could quite easily inform the C# compiler that an used task has been unused on purpose:

public Task StartTask()
{
	// ...
}

public void OnClick()
{
	StartTask().Forget();
}

Using discards we don't need extra extension methods to transport such information. Also, users already know what "will happen" to the task (hint: the answer is nothing).

public void OnClick()
{
	_ = StartTask();
}

We will also use discards in the pattern matching section.

Useful for Avoid for
  • Throwing away information
  • Run any expression
  • "Forgetting" tasks
  • Storing information
  • Side-effect free expressions

Handling Asynchronous Code

We already touched the topic of asynchronous code briefly in the previous section. Since .NET 4 we have the Task type, which is quite handy to tame multiple streams of work. Together with the task parallel library (TPL) and async / await (C# 5 / .NET 4.5) we have a powerful toolbelt that only improved over the years.

But why does async / await require a specific version of .NET? Isn't this just a language feature? Like always (e.g., interpolated strings, tuples) if we require a specific version of the base class library (BCL) we immediately know that some code is generated which uses the types from the BCL. In case of a method being decorated as async it will generate a new class implementing the IAsyncStateMachine interface.

The IAsyncStateMachine interface looks as follows:

public interface IAsyncStateMachine
{
	void MoveNext();

	void SetStateMachine(IAsyncStateMachine stateMachine);
}

Interesting enough it has a MoveNext method just like the IEnumerator interface. In fact, we could use a specialized version of an IEnumerator to write our own async / await implementation. Coming from JavaScript we know that generators (the JavaScript name for the enumerator / yield syntax sugar) have been (ab)used to introduce async / await capabilities before the feature arrived in the language. Even today polyfills still use this (or fall back even one level before that in case generators are not available).

Let's look at a simple example of a method using async and await:

async static Task Run(Func<Task> action)
{
	await action();
}

This little snippet generated a class to look as follows:

Generated IAsyncStateMachine

In MSIL the generated class is then used in the given Run method:

IL_0000:  newobj      UserQuery+<Run>d__1..ctor
IL_0005:  stloc.0     
IL_0006:  ldloc.0     
IL_0007:  ldarg.0     
IL_0008:  stfld       UserQuery<Run>d__1.action
IL_000D:  ldloc.0     
IL_000E:  call        System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Create
IL_0013:  stfld       UserQuery+<Run>d__1.<>t__builder
IL_0018:  ldloc.0     
IL_0019:  ldc.i4.m1   
IL_001A:  stfld       UserQuery+<Run>d__1.<>1__state
IL_001F:  ldloc.0     
IL_0020:  ldfld       UserQuery+<Run>d__1.<>t__builder
IL_0025:  stloc.1     
IL_0026:  ldloca.s    01 
IL_0028:  ldloca.s    00 
IL_002A:  call        System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start<<Run>d__1>
IL_002F:  ldloc.0     
IL_0030:  ldflda      UserQuery+<Run>d__1.<>t__builder
IL_0035:  call        System.Runtime.CompilerServices.AsyncTaskMethodBuilder.get_Task
IL_003A:  ret         

In short, we instantiate the generated class, store the used arguments (captures) and set the state to be used within the async state machine. The static Create method of the async task method builder is used to construct the associated builder state. Then we run the async task method builder to construct us a task for this. Finally, we return the Task property from the builder.

Needless to say a simpler version of the code above would have been:

static Task Run(Func<Task> action)
{
	return action();
}

These two variants are not exactly equalivant. Beforehand, we returned a newly generated task, "wrapping" the original task. Now, we return the original one. Performance-wise they are certainly not the same. In this version we omit a full class generation. Also the MSIL for running the method is super short in comparison:

lang="msil"> IL_0000: nop IL_0001: ldarg.0 IL_0002: callvirt System.Func<System.Threading.Tasks.Task>.Invoke IL_0007: stloc.0 IL_0008: br.s IL_000A IL_000A: ldloc.0 IL_000B: ret

Obviously, we would only use async methods if we have multiple awaits or complicated structures (e.g., only await in a certain if block). In any other case we should try to go with something lighter. Both, compile-time and runtime, will thank us with faster execution.

This is even more true for wrapping standard items in a task. Consider we create a class that implements an interface demanding the following method:

public Task<Task> GetNameAsync()
{
	// ...
}

If we already know the name we could return it directly, but how to wrap it in a task? Simplest case is decorating the method with async, but instead we could also use use Task.FromResult:

public Task<Task> GetNameAsync() => Task.FromResult("constant name");

As a rule of thumb always look for non-generated solutions first.

At this point we could write about certain benefits, e.g., when to use ConfigureAwait(false) and all the things we are allowed to do with async / await these days (e.g., in try-catch blocks), but I feel that many articles (including my own) did that already. Instead, I want to touch the topic of iterator awaits.

Before C# 8 we had no good way of dealing with asynchronous streams. Awaiting the stream is equivalent to only reacting when the stream has finished. The alternative is to await until the first data comes from the stream. This, however, also does not solve the issue as data is not present. What we want is an implicit loop that may await until certain chunks of data are available. The loop ends when the stream is finished.

All this sounds like a boost in the iterator. Again, the following is not the solution:

foreach (var item in await GetItems())
{
	// ...
}

Potentially, we could wrap the stream resulting in:

foreach (var getNextItem in GetItems())
{
	var item = await getNextItem();
	// ...
}

But if there is no next item? We now place a callback. If there is none we could either receive null or throw an exception. Both scenarios have clear drawbacks. Hence let's go with a custom data type.

foreach (var getNextItem in GetItems())
{
	var state = await getNextItem();

	if (state.Finished)
	{
		break;
	}

	var item = state.Current;
	// ...
}

It's still all a bit messy, especially from a boilerplate point of view. Thus we now have await foreach. This one can be used in conjunction with the new IAsyncEnumerable interface.

public async IAsyncEnumerable<int> GetNumbersAsync()
{
    var current = 0;

    while (++current < 4)
    {
        await Task.Delay(500);
        yield return current;
    }
}

I want to spare you now the details how this is generated (and what is exactly generated), but you can guess its similar to the structures we inspected beforehand. In the end its just the state machine of async / await combined with the iterator.

That all of this is similar can be seen directly by inspecting the async enumerable - we see that this is quite like a "normal" enumerable. It just is now dependent on an async enumerator called IAsyncEnumerator (who would have guessed?):

public interface IAsyncEnumerator<out T> : IAsyncDisposable
{
    T Current { get; }
 
    ValueTask<bool> MoveNextAsync();
}

Great, so how's the syntax sugar for this looking like?

await foreach (var item in GetNumbersAsync())
{
	// ...
}

Wonderful, so now also this gap is closed! The async iterator can be super powerful especially for streams of events.

Useful for Avoid for
  • Complex task logic
  • Taming multiple work streams
  • Handling asynchronous events / streams
  • Wrapping of tasks

Pattern Matching

In recent years the direction of C# has certainly changed a bit. It picked up more and more functional concepts. One of the more interesting concepts is pattern matching. Pattern matching in C# comes in multiple ways, for instance in an improved is operator. Officially, they call it a "pattern expression".

Beforehand, we had to use all kinds of different operators to archieve something like a type transformation with a subsequent check. For classes, we could have used as:

var element = node as IElement;

if (element != null)
{
	// ...
}

But with the new power of the is operator such fragments / temporary variables are no longer necessary. We can just write code that reads well.

if (node is IElement element)
{
	// ...
}

Perfect, right? Boring you say. Alright, so maybe the new switch control structure is more for your taste. Personally, I would have liked to see a new match construct, however, I can see why new reserved keywords have been avoided and I like the progressive approach.

switch (node)
{
	case IElement element:
		// ...
		break;
}

Let's inspect the generated MSIL code for this construct:

IL_0000:  nop         
IL_0001:  ldarg.1     
IL_0002:  stloc.3     
IL_0003:  ldloc.3     
IL_0004:  stloc.0     
IL_0005:  ldloc.0     
IL_0006:  brtrue.s    IL_000A
IL_0008:  br.s        IL_0016
IL_000A:  ldloc.0     
IL_000B:  isinst      IElement
IL_0010:  dup         
IL_0011:  stloc.1     
IL_0012:  brfalse.s   IL_0016
IL_0014:  br.s        IL_0018
IL_0016:  br.s        IL_001E
IL_0018:  ldloc.1     
IL_0019:  stloc.2     // element
IL_001A:  br.s        IL_001C
IL_001C:  br.s        IL_001E
IL_001E:  ret         

Alright, no magic here. It's pretty much the same as if we write:

if (node is IElement)
{
	var element = (IElement)node;
	// ...
}

There are a few subtle differences though. Most notably, we have an explicit cast in the generated MSIL form. Using the previously mentioned pattern expression we would be even closer to the MSIL code generated by using the new switch construct. Thus, we can really say switch is purely syntax sugar to avoid repetition.

Is it just syntax sugar over pattern expressions? Well, at least its nice sweet sugar. Especially, since it comes with extensions. In the context of a switch branch we can use the when keyword to introduce more conditions.

The C# documentation lists a great example:

switch (shape)
{
    case Square s when s.Side == 0:
    case Circle c when c.Radius == 0:
        return 0;
    case Square s:
        return s.Side * s.Side;
    case Circle c:
        return c.Radius * c.Radius * Math.PI;
}

Wonderful - this way we avoid complicated constructs that would need to use goto or local functions for avoiding repetitions.

The C# team even went one step further. Besides explicit types (which would check if a cast is possible) we can also implicitly use the current type. As usual, var is the keyword that triggers type inference.

The official documentation mentions the following example:

switch (shapeDescription)
{
    case "circle":
        return new Circle(2);
    case "square":
        return new Square(4);
    case "large-circle":
        return new Circle(12);
    case var o when (o?.Trim().Length ?? 0) == 0:
        return null;
}    

Hence the specific white-space case uses the implicit type for triggering additional checks using when. As an alternative, we could have written case string o when. Nevertheless, var should be preferred as it will also stand the test of time in case of refactoring. Furthermore, it will transport to the reader "hey I don't want to check the casting here, I just want to introduce more conditions". After all, transporting intentions to the reader is important.

Useful for Avoid for
  • Avoiding casting repetitions
  • Simplifying many branch splitting
  • Bringing together scattered, but related blocks of logic
  • Just replacing simple if statements
  • Merging whole blocks of unrelated logic

Nullable Types

Finally something about types! Potentially, the most important change in years (or ever) in C# development has come. Nullable types!

What? I mean, every class represents a heap allocated object that must be created first and otherwise points to a default address known as "null pointer" or simply null. The null reference exception is potentially the most striking one and it speaks about the age of the language (or framework) that its not covered by default in the type system. Luckily, there are some pretty smart people in the C# language team and they came up with a solution that is both, progressive and fitting.

Previously, we just received some type information from the methods we have been calling. An example would be the following code:

var element = document.QuerySelector("a");
// element is of IElement, but can it be null ?

With nullable types every type T is non-null. This is now in alignment to value types, which require a wrapper be (fake) nullable (T? or Nullable<T>). For reference types no such wrapper exists, however, the information is transported via the metadata instead.

Since this is a quite sensitive feature it needs to be enabled first. The following lines must appear in the csproj file of the project where we want to introduce nullable types.

<LangVersion>8.0</LangVersion>
<Nullable>enable</Nullable>

Now instead of returning just the type we can also decorate it using the question mark to signal a return that is potentially null.

var element = document.QuerySelector("a");
// element is of IElement?, we should introduce checks!

The same holds true for method signatures. Let's consider the following signature:

public void Insert(IElement element);

Using the method with an IElement? instance is not allowed. Instead, we have to introduce type guards.

public void InsertMaybe(IElement? element)
{
	if (!(element is null))
	{
		// type transformed to IElement from IElement?
		Insert(element);
	}
}

Long story short: Nullable makes our life easier by detecting where we require guarding and where not. We should treat nullability violations as errors.

Nullable in a Nutshell

Every method that works in the nullable context will be annoted accordingly with a NonNullTypes attribute. In addition, references that are nullable are also explicitly marked as Nullable. As a result, the C# compiler is capable of inferring the correct usage also from third-party libraries or BCL where no source code is given.

Nullable Metadata Annotations

Besides the generated metadata everything stays as usual. There are no MSIL implications. This is just an upgrade for making all applications more robust and better.

Useful for Avoid for
  • Transporting intentions
  • Warning / code robustness
  • Avoiding too much guarding
  • Trusting non annoted code

Outlook

This has been the last part of the series. It's been a pleasure and a joy to compile this collection together.

With respect to types the evolution of C# seems to not have finished yet. Languages such as F# (CLR based) or TypeScript (with JS transpilation) show what is possible and how. We can expect a lot more improvements in the ecosystem to arrive within the next couple of years.

Conclusion

The evolution of C# has not stopped at the used and generated types. We saw that C# gives us some more advanced techniques to gain flexibility without much help from outside tooling. Nevertheless, the help from external tooling gives us much more possibilities without making many sacrifices.

Personally, I hope that TypeScript's flexible type system can be used as a role model for bringing some advanced compile-time manipulation, creation, and evaluation of types to our tool belt.

Points of Interest

I always showed the non-optimized MSIL code. Once MSIL code gets optimized (or is even running) it may look a little bit different. Here, actually observed differences between the different methods may actually vanish. Nevertheless, as we focused on developer flexibility and efficiency in this article (instead of application performance) all recommendations still hold.

If you spot something interesting in another mode (e.g., release mode, x86, ...) then write a comment. Any additional insight is always appreciated!

History

  • v1.0.0 | Initial Release | 28.09.2019
Created . Last updated .

References

Sharing is caring!