Blog

  • Field Note: Revisiting Quality

    For a long time, I have had a somewhat complicated relationship with the word “quality.” Like many people who started their careers in testing and QA, I gradually became uncomfortable with the label. Not because quality is unimportant, but because the term often seemed too narrow.

    When people hear “quality,” they frequently think about testing. When they hear “testing,” they often think about defects. And when they think about defects, the conversation quickly becomes operational rather than strategic.

    Over time, I found myself increasingly interested in topics that appeared to sit outside the traditional quality domain:

    • Architecture
    • Data
    • Governance
    • Operating models
    • Organizational capability
    • Decision-making

    These seemed like larger and more interesting questions. Or so I thought.

    An Unexpected Observation

    Recently, while working with different kinds of assessments, I started noticing a recurring pattern. The object being assessed varied considerably:

    • A software system
    • A data platform
    • An organizational capability
    • A process
    • An architecture

    Yet the assessment itself always seemed to follow the same structure.

    First, there was an idea of what “good” looked like. Then there was a way of collecting evidence. Then findings were identified. Finally, a judgment was made about fitness, capability, risk or readiness. The terminology changed, but the pattern did not.
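
    The recurring structure can be sketched in a few lines of Python. This is only an illustration of the four steps, not a real assessment tool: the names `Finding` and `assess`, the example expectations and the observed values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    criterion: str   # the expectation that was checked
    met: bool        # whether the evidence supports it
    evidence: str    # what was actually observed


def assess(expectations, collect_evidence):
    """Apply the four steps: expectations, evidence, findings, judgment."""
    findings = []
    for criterion, is_satisfied in expectations.items():
        observed = collect_evidence(criterion)
        findings.append(Finding(criterion, is_satisfied(observed), str(observed)))
    unmet = [f.criterion for f in findings if not f.met]
    judgment = "fit for purpose" if not unmet else "gaps: " + ", ".join(unmet)
    return findings, judgment


# Hypothetical example: assessing a data platform.
expectations = {
    "ownership defined": lambda value: value is True,
    "failed loads per week": lambda value: value <= 2,
}
observations = {"ownership defined": True, "failed loads per week": 5}

findings, judgment = assess(expectations, observations.get)
print(judgment)
```

    Swapping in a different object only changes the expectations and the way evidence is collected. The `assess` function itself stays the same, which is the point of the observation.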

    The Question Behind the Question

    At first, I thought I was moving away from quality. Now I am no longer sure. Perhaps I was not moving away from quality at all. Perhaps I was moving away from a narrow interpretation of quality. Traditional quality discussions often focus on products:

    • Does the software work?
    • Does it meet requirements?
    • How many defects remain?

    Important questions, certainly. But they are not the only quality questions.

    We also ask:

    • Is the architecture sustainable?
    • Is the data trustworthy?
    • Is the organization capable?
    • Is the process effective?
    • Is the operation resilient?

    These are quality questions too. We simply tend to classify them differently.

    Quality as Fitness for Purpose

    The more I think about it, the more useful the classic definition of quality becomes:

    Fitness for purpose.

    The phrase is deceptively simple. It does not limit quality to software. It does not limit quality to testing. It does not even limit quality to technology. Anything can be evaluated in terms of fitness for purpose:

    • A system.
    • A dataset.
    • A process.
    • An organization.
    • An operating model.
    • An architecture.

    The assessed object changes, but the underlying question remains remarkably consistent.

    Assessment as the Missing Link

    This realization led me to another thought: perhaps quality is not primarily about testing. Perhaps quality is primarily about assessment. Assessment is the mechanism through which we determine fitness for purpose.

    Testing is one assessment technique. Reviews are another. Interviews are another. Metrics analysis is another. Profiling is another.

    The specific techniques matter less than the purpose they serve. They generate evidence. That evidence supports an evaluation. That evaluation supports a decision.

    Viewed this way, testing becomes part of a much larger family of activities.

    A Wider Perspective

    Ironically, the more I tried to move away from quality, the more frequently I encountered it. Not the operational version of quality, but the broader version. The version concerned with evidence, confidence, risk and fitness for purpose. The version that appears whenever people need to make informed decisions about systems, data, organizations or processes.

    Perhaps quality is not a niche after all. Perhaps it only appears that way when viewed through the lens of testing.

    When viewed through the lens of assessment, quality seems to show up almost everywhere.

    An Open Thought

    I am not yet sure where this line of reasoning leads. It may turn out to be nothing more than a useful way of organizing ideas. Or it may suggest that testing, quality assurance, data quality, architecture reviews and capability assessments are all manifestations of the same underlying pattern. For now, I am content to leave that question open.

    What I find interesting is that an attempt to move beyond quality has unexpectedly led me back to it. Only this time, it looks much bigger than before.

  • Field Note: Assessment Requires Two Forms of Expertise

    One of the most interesting observations from recent assessment work is that high-quality assessments rarely emerge from domain expertise alone. Nor do they emerge from assessment methodology alone.

    The best outcomes seem to arise from the combination of both.

    The Domain Expert

    Every assessment requires people who understand the subject being assessed. They understand the context, the technology and the operational realities. They know where the complexity lives.

    Without this expertise, important signals are easily missed. Recommendations risk becoming generic. Conclusions risk becoming detached from reality. Domain expertise provides depth.

    The Assessment Expert

    At the same time, domain expertise alone is often insufficient. Understanding a system is not the same as assessing it. Assessment requires a different set of capabilities:

    • Structuring observations.
    • Evaluating evidence.
    • Identifying patterns.
    • Separating symptoms from root causes.
    • Testing assumptions.
    • Developing findings and recommendations.
    • Maintaining consistency and traceability.

    Assessment expertise provides structure.

    The Limitation of Either Role Alone

    A domain expert working alone may possess deep knowledge but struggle to transform observations into a coherent assessment. The result can become anecdotal, incomplete, or difficult to communicate.

    An assessment expert working alone may apply a rigorous methodology but lack sufficient understanding of the domain. The result can become superficial or disconnected from operational reality.

    Both forms of expertise are valuable. Neither is sufficient on its own.

    A Productive Tension

    What makes the combination powerful is the interaction between the two perspectives. The domain expert challenges the assessment model with reality. The assessment expert challenges the domain expert’s assumptions with evidence and structure.

    One contributes understanding. The other contributes evaluation.

    One asks:

    What is happening?

    The other asks:

    What does it mean?

    The dialogue between the two often produces insights that neither would have reached independently.

    Assessment as a Discipline

    This observation has broader implications. Many organizations treat assessments primarily as a domain activity. As a result, assessments are often led exclusively by subject matter experts.

    Yet assessment itself appears to be a professional discipline. Just as facilitation, coaching, architecture, and auditing require specific skills, assessment requires its own capabilities.

    Methodology matters.

    Evidence matters.

    Reasoning matters.

    Structure matters.

    A Working Hypothesis

    This leads me to a hypothesis:

    The quality of an assessment is determined not only by the quality of domain expertise, but by the quality of the interaction between domain expertise and assessment expertise.

    The strongest assessments do not emerge from either role alone. They emerge from collaboration between people who understand the system and people who understand how to assess it.

    Understanding provides insight. Assessment provides rigor. Both are required.

  • Field Note: Reference Models and Assessment Models

    Recently, I noticed that two concepts are often discussed together but rarely separated clearly: reference models and assessment models.

    At first glance, they seem closely related. The more assessments I perform, however, the more convinced I become that they answer two fundamentally different questions.

    What Good Looks Like

    Whenever we assess something, we implicitly compare reality against a picture of what “good” looks like. In testing, this might include test strategy, test automation, defect management, governance and continuous improvement. In data quality, it might include ownership, controls, monitoring and remediation processes. In architecture, it might include standards, principles and governance.

    Before we can assess anything, we need a view of the desired state. That is the role of the reference model. A reference model describes the capabilities, practices and characteristics that define success in a particular domain.

    It answers the question:

    What does good look like?

    Understanding Reality

    Once we have defined the desired state, a second question emerges. Where are we today?

    This requires a different kind of model, one built from elements such as:

    • Interviews.
    • Workshops.
    • Evidence collection.
    • Observations.
    • Document reviews.
    • Scoring methods.
    • Analysis techniques.

    This is the assessment model.

    The assessment model does not define excellence. Instead, it defines how we evaluate reality. It answers the question:

    How do we determine where we are today?

    The Same Structure Everywhere

    What fascinates me is how consistently this pattern appears: in testing, data quality, cybersecurity, agile maturity, organizational capability and leadership development. The domain changes. The structure remains remarkably similar:

    First, define what good looks like. Then assess the current state against that definition.

    Different objects, same underlying mechanism.

    Why the Distinction Matters

    Separating the two models turns out to be surprisingly useful. Without a reference model, assessment findings become subjective. Without an assessment model, the reference model remains theoretical.

    The two complement each other, but they should not be confused. In fact, many disagreements during assessments are not disagreements about findings at all. They are disagreements about one of the underlying models. Sometimes people disagree about what good looks like. Sometimes they disagree about the evidence. Separating the models makes those discussions much easier to navigate.
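
    One way to make the separation concrete is to keep the two models as separate objects in code. The sketch below is an assumption-laden illustration: the class names, the 0 to 3 scale, the capability names and the scoring rules are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceModel:
    """What good looks like: a target level per capability (hypothetical 0-3 scale)."""
    targets: dict


@dataclass(frozen=True)
class AssessmentModel:
    """How we evaluate reality: maps raw evidence to a level on the same scale."""
    scoring_rules: dict

    def score(self, evidence):
        return {name: rule(evidence[name]) for name, rule in self.scoring_rules.items()}


def gaps(reference, scores):
    """Compare the assessed state with the desired state."""
    return {
        name: target - scores[name]
        for name, target in reference.targets.items()
        if scores[name] < target
    }


reference = ReferenceModel(targets={"monitoring": 3, "ownership": 2})
assessment = AssessmentModel(scoring_rules={
    "monitoring": lambda e: min(3, e["alerts_configured"] // 5),
    "ownership": lambda e: 2 if e["named_owner"] else 0,
})
evidence = {
    "monitoring": {"alerts_configured": 7},
    "ownership": {"named_owner": True},
}

scores = assessment.score(evidence)
print(gaps(reference, scores))
```

    A disagreement about what good looks like is a change to `targets`. A disagreement about the evidence is a change to `scoring_rules` or to the evidence itself. Keeping them apart shows which of the two is actually being debated.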

    A Broader Observation

    The more I think about it, the more I suspect this distinction extends beyond formal assessments. Every evaluation contains some form of reference model. Often it is implicit. Sometimes it is undocumented. Occasionally it is not even shared by the people involved. Yet it is usually there.

    Whenever we judge quality, maturity, effectiveness or success, we are comparing reality against some notion of what good looks like. The assessment simply makes that comparison explicit.

    An Open Reflection

    I used to think of assessments primarily as activities. A sequence of interviews, workshops and analyses.

    Increasingly, I see them as the interaction between two separate models: one describing the desired state, the other describing how we evaluate reality.

    The distinction feels obvious once noticed. Yet I find myself returning to it repeatedly. Perhaps because many discussions become clearer the moment the two are separated. And perhaps because the same pattern appears in far more places than I initially expected.

  • Field Note: Assessing Capabilities, Not Processes

    One of the assumptions I have gradually become less convinced of is that organizations should be assessed primarily through their processes. This assumption is deeply embedded in many maturity models. Whether we look at software testing, quality assurance, data management, or other organizational disciplines, the assessment often revolves around questions such as:

    • Are processes documented?
    • Are they followed consistently?
    • Are they measured?
    • Are they continuously improved?

    The underlying logic is straightforward: Better processes lead to better outcomes.

    For a long time, I accepted this logic without much reflection. After all, processes are visible, assessable, and relatively easy to compare across organizations.

    Yet experience has repeatedly shown something different. Some organizations score highly against process-oriented models while struggling to deliver meaningful results. At the same time, other organizations operate with surprisingly lightweight processes and consistently outperform their peers.

    This raises an uncomfortable question:

    What exactly are we assessing?

    From Processes to Capabilities

    Over time, I have found it more useful to think in terms of capabilities rather than processes. A capability describes what an organization is able to do.

    Examples include:

    • Managing quality risks
    • Designing effective test strategies
    • Detecting data quality issues
    • Learning from production incidents
    • Making informed release decisions

    These capabilities exist independently of any particular process.

    Organizations can realize the same capability in very different ways. One organization may rely on formal governance structures, documented procedures, and predefined workflows. Another may achieve similar outcomes through strong collaboration, experienced practitioners, and adaptive working methods.

    The capability is the same. The implementation differs.

    Capabilities and Practices

    This distinction becomes clearer when separating capabilities from practices.

    A capability describes an outcome-oriented ability.

    A practice describes one way of realizing that ability.

    For example, an organization may have the capability to manage quality risks effectively. The practices used to support that capability could include:

    • Risk workshops
    • Risk matrices
    • Exploratory testing charters
    • Failure mode analysis
    • Automated risk detection
    • AI-assisted analysis

    None of these practices are the capability itself. They are merely different ways of expressing it.

    This distinction matters because practices evolve continuously. New techniques emerge, old techniques disappear, and fashionable methods come and go. The underlying capability, however, remains relevant.
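
    As a rough data-model sketch of this distinction, a capability can hold its practices as an interchangeable set while being judged only on observed outcomes. The `Capability` class, the three example organizations and their outcome records are hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Capability:
    name: str
    practices: set = field(default_factory=set)           # interchangeable means
    outcome_evidence: list = field(default_factory=list)  # observed results

    def demonstrated(self):
        # Judged by outcomes, not by which practices are in place.
        return bool(self.outcome_evidence) and all(self.outcome_evidence)


formal = Capability(
    "Managing quality risks",
    practices={"risk workshops", "risk matrices"},
    outcome_evidence=[True, True],
)
lightweight = Capability(
    "Managing quality risks",
    practices={"exploratory testing charters"},
    outcome_evidence=[True, True],
)
many_practices = Capability(
    "Managing quality risks",
    practices={"risk workshops", "risk matrices", "failure mode analysis"},
    outcome_evidence=[True, False],
)

for org in (formal, lightweight, many_practices):
    print(sorted(org.practices), org.demonstrated())
```

    The first two organizations share no practices yet demonstrate the same capability, while the third has the most practices and does not. Replacing a practice changes the set, not the capability being assessed.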

    The Limits of Process Assessments

    Many traditional assessments unintentionally reward compliance rather than effectiveness.

    Consider a common assessment question:

    Is there a documented test strategy?

    The existence of a document may provide evidence of organizational discipline. It does not demonstrate that the organization can make sound testing decisions.

    Similarly:

    • A retrospective does not prove learning.
    • A dashboard does not prove insight.
    • A governance process does not prove control.
    • A strategy document does not prove strategic alignment.

    These artifacts may indicate capability. They are not proof of capability.

    The real question is whether the organization can consistently achieve the desired outcome.

    What Should Be Assessed?

    This leads to a different perspective on organizational assessments. Rather than asking whether predefined processes exist, we should ask whether the organization possesses the capabilities required to achieve its objectives.

    Processes remain important. Practices remain important. Artifacts remain important. But they should be treated as evidence rather than objectives.

    The purpose of an assessment is not to determine whether an organization resembles a reference process model.

    The purpose is to understand what the organization is capable of achieving.

    A Working Hypothesis

    This leads me to a hypothesis that increasingly influences how I design assessments:

    Organizations are more meaningfully assessed through their capabilities than through their compliance with predefined process models.

    Processes are valuable. Practices are valuable. But they are ultimately means rather than ends. What matters is not whether an organization follows the prescribed path. What matters is whether it can reliably achieve the outcomes that the path was intended to support.

  • How This Blog Came to Be

    This wasn’t supposed to happen.

    A few weeks ago, all I wanted was a better CV.

    After many years in software development, testing, quality engineering and organisational assessments, my CV had become what many long careers become: a chronological list of projects, responsibilities and technologies. Accurate enough, but not particularly insightful. So I asked an AI to review it.

    I expected editorial assistance. Better wording. Better structure. Perhaps a few suggestions on what to remove or emphasize. Instead, something rather unexpected happened.

    The conversation moved from my CV to my LinkedIn profile. From there, it became a discussion about recurring themes in my career and what seemed to distinguish me from others with similar backgrounds. Then came a SWOT analysis. Then conversations about positioning and professional identity. Then questions about what kind of work I genuinely enjoy, where I add the most value and, perhaps more importantly, where I don’t.

    Somewhere along the way I realised we were no longer talking about my CV at all. We were talking about me. Or more specifically, about the story my career told when viewed as a whole rather than as a sequence of individual jobs.

    The discussion became increasingly reflective. If I have another decade or so left in my professional career, how do I actually want to spend it? What do I want to be known for? What kind of problems do I most enjoy solving? Where can I contribute something that isn’t simply another pair of experienced hands?

    What surprised me most wasn’t that AI could rewrite sentences or suggest bullet points. It was that it could synthesise years of information, recognise patterns across them and offer perspectives that I hadn’t previously articulated myself. Sometimes it asked excellent questions. Sometimes it challenged my assumptions. Sometimes it proposed interpretations that immediately resonated.

    Not every suggestion was right. Many weren’t. But enough of them were insightful that they changed the direction of the conversation. And, in a very real sense, they changed the direction of my thinking.

    Eventually, the discussion turned into ideas for articles. Then themes. Then potential series of posts. And eventually, almost without noticing it, I found myself planning a blog. That’s why you’re reading this.

    This won’t be a blog about AI, although AI will undoubtedly appear from time to time. It’s a blog about software delivery, quality engineering, organisational assessments and continuous improvement. It’s about understanding complex systems, identifying patterns and helping organisations become better at building software.

    Ironically, that’s also what happened to me. An exercise that started with improving a CV became an assessment of something much larger: my own experience, strengths, motivations and direction. It also left me with a hypothesis that I suspect will appear repeatedly in future posts.

    The most interesting outcomes don’t necessarily come from humans or AI working in isolation. They come from the interaction between the two. I started the conversation expecting editorial assistance. I ended it with a clearer understanding of my own career and a completely new set of ideas about where to take it next.

  • Start With Why

    Why this blog

    I’ve spent much of my career assessing and improving software delivery organizations. Along the way, I became increasingly interested in the assessment itself: What constitutes evidence? How should confidence be communicated? Are we assessing processes, or capabilities? And can assessment methodology be separated from domain knowledge?

    This blog is a collection of field notes rather than finished theories. It’s where I explore ideas, question assumptions, and try to develop clearer concepts based on practical experience and ongoing reflection.

    While many examples come from software quality and testing, my interest is broader: evidence-based assessment of organizational capabilities and the models we use to understand them.


    Why “Field Notes”

    These aren’t polished frameworks or universal truths. They’re observations from practice, hypotheses in development, and attempts to make sense of recurring patterns I’ve encountered over the years. Some ideas will mature. Others will be discarded. The purpose is not to be right, but to make my thinking visible and open to scrutiny.