Repository-Bound Documentation

Documentation is very important. However, we often face two problems: keeping it in sync with the project and making it available to those who need it.

In a standard development project we have different kinds of documentation systems. Among others we have:

  • Wiki pages, e.g., Confluence
  • Document stores, e.g., SharePoint
  • Chat systems, e.g., Slack
  • Repository-bound documentation, e.g., README

The last one is not a system in itself, but I will argue that it should be used like one. The assumption is that the repository is used in conjunction with a version-control system such as git.

I will now argue that multiple systems should be used to store project information. However, their number should be limited to three and they should link to each other. In short, I consider the following systems relevant (keep in mind, though, that this is the general case - depending on the project's needs a different combination may be worth considering):

  • Feature information and fixed-state documents should go to a document store
  • Project organization and coordination have to be on a Wiki
  • Living documentation (technical specification that changes and is directly linked to the source code) should be checked in to the repository

A chat system such as Slack or Microsoft Teams could take on several of these roles. Beyond that, such systems are mainly appreciated for improving communication, which complements the three systems above rather than replacing them.

From the previous list we can already infer that the lifetime of the respective documents determines their location. Those that have a fixed lifetime and don't change along with the source code are placed in the document store. Likewise, documents that are only of interest in their current state should be in a Wiki. These change on a regular basis and form a suitable umbrella that provides general information to all stakeholders. In its simplest form, the Wiki could even be reduced to a list of links.

Finally, we can have a look at the repository-bound documentation. It covers the cases that are neither static nor only of interest in their current state (even though the current state will always be the primary use case). In practice, the only documents that fit this definition are usually strongly coupled to the source code. Therefore, it makes sense to keep them under version control in the same repository as the associated code.

What advantages does this approach give us? Let's say we have a changelog that gets updated once a new feature has been implemented. With the repository-bound approach that changelog will always reflect the state of the code at the point in time I'm looking at. A changelog has been part of most repositories for a long time, so you may argue that this is trivial. However, the same conclusion can be drawn for architecture documents or other high-level documents.
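To make "the state at the point in time I'm looking at" concrete, here is a minimal sketch of how a document could be read as it was at a given revision. It relies on git's `<revision>:<path>` syntax for `git show`; the function names, the tag `v1.2.0` and the path `docs/CHANGELOG.md` are placeholders I made up for illustration.

```python
import subprocess


def show_command(revision: str, path: str) -> list[str]:
    """Build the git command that prints `path` as it was at `revision`.

    `path` is given relative to the root of the repository.
    """
    return ["git", "show", f"{revision}:{path}"]


def document_at(revision: str, path: str) -> str:
    """Return the content of a checked-in document at the given revision.

    Must be called from inside a git working copy; raises
    subprocess.CalledProcessError if the revision or path does not exist.
    """
    result = subprocess.run(
        show_command(revision, path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `document_at("v1.2.0", "docs/CHANGELOG.md")` inside a repository would then return the changelog exactly as it was shipped with that (hypothetical) tag, and the same call works for any other checked-in document.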

Previously, it was quite a nightmare to map a specific state of the code to the documents of interest. With this approach the pain seems to be gone for good. There are just some rules that need to be followed:

  • Put all relevant documents in a dedicated folder - potentially with a good structure
  • Write the documents in a plain-text format - binary files are the enemy of version control
  • Integrate a task in your build system that is able to convert the documents into PDF files or other formats
  • Generate what can be generated

The last rule, generating what can be generated, is of particular interest as it gives us yet another advantage over the (unfortunately quite often used) Word documents. Big corporations have introduced the concept of versioning within documents. Even though the idea is great, it is done completely wrong:

  • The user is just picking a version number
  • Not every change is tracked - only those the user remembers to record
  • Previous versions are lost if the user does not store / distribute the file
  • What's the diff?

Obviously, the second point may also be raised against a VCS in general. However, with a VCS the reminders are stronger and it is much harder to simply forget about recording a change.

What does this have to do with the generation of content? Well, as an example, such a history section could be added automatically by using git to read the file's history.
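A minimal sketch of such a generated history section could look like this. It asks `git log` for the commits that touched a file and renders them as a plain-text list. The function names and the layout of the section are my own choices, not a fixed convention; the git options used (`--follow`, `--date=short`, `--pretty=format:`) are standard ones.

```python
import subprocess

# Short hash, date, author and subject, separated by tabs (%x09).
LOG_FORMAT = "%h%x09%ad%x09%an%x09%s"


def read_history(path: str) -> str:
    """Return the raw git log output for a single file, newest commit first."""
    result = subprocess.run(
        ["git", "log", "--follow", "--date=short",
         f"--pretty=format:{LOG_FORMAT}", "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def render_history(log_output: str) -> str:
    """Turn the tab-separated log output into a plain-text history section."""
    lines = ["History", ""]
    for entry in log_output.splitlines():
        if not entry.strip():
            continue
        # Split at most three times so tabs in the subject are preserved.
        commit, date, author, subject = entry.split("\t", 3)
        lines.append(f"  • {date}  {commit}  {author}: {subject}")
    return "\n".join(lines)
```

The build task could append `render_history(read_history(path))` to each document before converting it, so the version history is never typed by hand and cannot be forgotten.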

As a conclusion, I see all of the mentioned options being used in larger projects. In my opinion the most important one is the repository-bound documentation, which should already be present in smaller projects.
