In this article I’m exploring code quality. When programming, people make mistakes and introduce bugs. Such bugs can have all kinds of negative consequences, ranging from mild annoyance to life-threatening situations. Let’s see how we can improve code quality so that we introduce fewer bugs when developing new functionality or reworking existing functionality.
My own programming experience
I wrote a longer overview of my own experience in software development here. For now, I’d like to highlight that during my computer science studies I didn’t learn about software testing. Nevertheless, I decided to build my career in software testing. Not because I liked testing very much, but because I felt it was a necessary evil given the insufficient quality of software. Even today I feel that software quality isn’t at a level where I would recommend reducing the effort spent on testing. And I think that I won’t be out of a job for the time being.
Okay, this intro sounds a bit like a rant, so I’ll now step down from my soapbox. My aim is to have an open discussion with anyone, including experienced professional software developers. You may agree with me and provide backup, more arguments and more areas to explore. Or you may disagree with me and tell me where I am wrong. I don’t hope to be right; I hope to learn instead. So feedback is more than welcome, especially in this discussion.
The limitation of imperative programming
The programming approach that I’ve mostly seen around me is a variation of imperative programming: describing functionality as a sequence of statements that change the program’s state, usually in terms of variables and data structures. This makes sense, since a computer processes programs this way. Higher-level languages are essentially easier-to-understand abstractions of machine language.
The original BASIC dialects have few capabilities for structured programming. Pascal and C are procedural languages: structure is provided by blocks, procedures and functions, and by local and global data structures. Then we have object-oriented languages like C++ and Java. Objects are essentially functions and procedures bundled together with the data structures they enclose.
Modern approaches for software development like agile development, extreme programming, Scrum and Test Driven Development (TDD) certainly have their merits in terms of solving various aspects around the software development process. But they don’t really address what I consider one of the main root causes of bugs: unintended and unexpected state changes.
Such state changes are a direct consequence of imperative programming. It is very hard to ensure that all state changes are fully controlled. How can we make sure that a function only affects its own local data structures and does not have any side effects on global data structures? How can we make sure that the code only makes the changes that we think it is making and nothing else?
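To make the problem concrete, here is a minimal Python sketch. The names `inventory`, `reserve_impure` and `reserve_pure` are made up for illustration. The first function quietly modifies a global data structure, so calling it twice with the same arguments gives two different results; the second one only works on what it is given and returns a new value.

```python
# Hypothetical example: a shared, module-level data structure.
inventory = {"apples": 3}


def reserve_impure(item, amount):
    """Return the remaining stock, but also silently change global state."""
    inventory[item] = inventory.get(item, 0) - amount  # hidden side effect
    return inventory[item]


def reserve_pure(stock, item, amount):
    """Return a new dict; the caller's data structure is left untouched."""
    updated = dict(stock)  # work on a copy instead of the original
    updated[item] = updated.get(item, 0) - amount
    return updated


# Same arguments, different results: the outcome depends on hidden state.
first = reserve_impure("apples", 1)
second = reserve_impure("apples", 1)

# Same arguments, same result: nothing outside the function changes.
before = {"apples": 3}
after = reserve_pure(before, "apples", 1)
```

Nothing in the signature of `reserve_impure` tells the caller that `inventory` changes, which is exactly the kind of unexpected state change I mean.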
Towards a solution: functional programming?
We can limit the access that components have to ‘foreign’ data structures by encapsulating data in objects and by using separate namespaces. We need to make the code very simple, clear, concise and easy to understand. Design patterns can help us with that. And we need to test… rigorously, over and over again, with every change we make in the code. This calls for automated testing, connected to clear requirements. TDD is a must in my view.
Is there a way to really avoid this stateful mess and approach programming in a different way? Even before starting my computer science classes I heard things like ‘mathematically proving the correctness of the code.’ Since I don’t feel that confident when it comes to math, this sounded rather intimidating. At the same time, it felt very far from reality.
What was referred to was functional programming. This is about declaring what a program should do, rather than how. Yes, there are functions, but they behave like mathematical functions: they evaluate expressions and avoid changing state and mutating data. The functions are pure: every time a function is called with the same input, the same output can be expected, and nothing else is affected.
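As a small illustration of ‘what’ versus ‘how’, consider the following Python sketch. The function names and the sample data are my own assumptions, chosen only for this example. Both functions compute the same total; the first spells out every state change, while the second is a single expression without any reassigned variables.

```python
def total_even_squares_imperative(numbers):
    """Describe HOW: step through the data and update a running total."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n  # state changes on every matching element
    return total


def total_even_squares_declarative(numbers):
    """Describe WHAT: the sum of the squares of the even numbers."""
    return sum(n * n for n in numbers if n % 2 == 0)


data = (1, 2, 3, 4)  # a tuple, so the input itself cannot be mutated

imperative_result = total_even_squares_imperative(data)
declarative_result = total_even_squares_declarative(data)
repeated_result = total_even_squares_declarative(data)  # same input again
```

Both versions happen to be pure here, since neither touches anything outside itself. The difference is that the second version leaves no room for an intermediate state to go wrong.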
The order in which the functions are executed is supposed to be less important. Explicit flow control, in the form of loops and sequences of statements, largely disappears. Iteration is replaced by recursion (a function calling itself).
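A minimal sketch of that last point in Python, with function names that are purely illustrative: the loop version keeps updating a variable, while the recursive version defines the total of a list in terms of the total of a shorter list.

```python
def total_loop(numbers):
    """Iteration: a running total that is reassigned on every pass."""
    total = 0
    for n in numbers:
        total += n
    return total


def total_recursive(numbers):
    """Recursion: the total is the first element plus the total of the rest."""
    if not numbers:  # base case: an empty sequence sums to 0
        return 0
    head, *tail = numbers
    return head + total_recursive(tail)


loop_result = total_loop([1, 2, 3, 4])
recursive_result = total_recursive([1, 2, 3, 4])
```

Keep in mind that Python limits the recursion depth and does not optimise tail calls, so this is meant as an illustration of the idea rather than as a recommendation for long lists.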
With a stateless system a large source of bugs is automatically eliminated. An extra benefit of this programming approach seems to be that such programs are much more suitable for parallel processing. Not only large academic clusters but also our own hardware has multiple cores and threads that can be utilised to a higher degree than with imperative programs.
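The following Python sketch hints at why this works. It assumes a trivial pure function, `square`, that I made up for the example. Because the function does not depend on or change any shared state, the calls can be handed to a pool of workers in any order and the result is still the same as with a plain sequential `map`.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    """A pure function: no shared state is read or written."""
    return n * n


numbers = list(range(10))

# Sequential evaluation, one call after the other.
sequential = list(map(square, numbers))

# The same calls distributed over worker threads. Executor.map returns
# the results in input order, whatever order the calls actually run in.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(square, numbers))
```

A thread pool is used here only to keep the sketch short. In a standard CPython build, threads do not speed up CPU-bound work; `ProcessPoolExecutor` offers the same `map` interface for that case.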
Since many mainstream languages support both styles, we can start by approaching a problem in an imperative way (the way I’m used to, like, I guess, most professional developers). Then, when things are working, we can refactor the code towards a more functional style. This refactoring calls for an approach that I will try to use consistently from now on:
1. Define a user story that describes what a function is supposed to do (in an In Order To/As/I want format)
2. Define the different scenarios for this function (in a Given/When/Then format)
3. Write an automated unit test for each scenario
4. Watch the unit test fail (RED)
5. Enhance the function so that the unit test will pass (quick and dirty, imperative is okay)
6. Execute all automated tests and see them pass (also to make sure that I don’t introduce any unintended side effects, GREEN)
7. Refactor the function towards a more functional style
8. Execute all automated tests again to verify that the refactoring was successful (or fix the refactoring and repeat this step)
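The workflow above could look roughly like this in Python, using the standard `unittest` module. The user story, the scenarios and the function names are hypothetical; I invented them just to have something to test. The sketch shows the end state, with both the quick-and-dirty imperative version and the refactored version checked by the same scenarios.

```python
import unittest

# Hypothetical user story:
#   In order to know what I have to pay,
#   as a customer,
#   I want to see the total price of my basket.


def basket_total_imperative(items):
    """Quick-and-dirty version that makes the tests pass (GREEN)."""
    total = 0
    for price, quantity in items:
        total = total + price * quantity
    return total


def basket_total(items):
    """Refactored version: a single expression, no variable is updated."""
    return sum(price * quantity for price, quantity in items)


class BasketTotalTest(unittest.TestCase):
    # Both versions must satisfy the same scenarios.
    implementations = (basket_total_imperative, basket_total)

    def test_empty_basket(self):
        # Given an empty basket
        items = []
        for total in self.implementations:
            # When the total is calculated, then it is 0
            self.assertEqual(total(items), 0)

    def test_basket_with_items(self):
        # Given two items at 2 each and one item at 5
        items = [(2, 2), (5, 1)]
        for total in self.implementations:
            # When the total is calculated, then it is 9
            self.assertEqual(total(items), 9)


# Run the scenarios programmatically and keep the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(BasketTotalTest)
result = unittest.TextTestRunner().run(suite)
```

In a real project the tests would live in their own file and be run with `python -m unittest`. Writing them before the function exists is what produces the RED step.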
I’m sure that steps 1 to 6 already give a positive quality push to the code I will produce. But I’m curious about what I can do with step 7, especially towards the more functional programming domain.
I’ll explore all this more hands-on in this post.