Quality & Testing

Quality Radar

Quality Radar is a self-assessment tool for teams building digital products. It plots the activities that drive quality across four kinds of work and three kinds of signal — so you can see what you invest in, and where you are blind.

It is also the most elegant way to describe what we actually sell. We can help you get these things in order, either one at a time or as larger undertakings.

How to read this radar: a quick guide to the quadrants, rings, and running it with your team.

What this is

The Quality Radar is a self-assessment tool for teams building and running digital products and services. It plots the activities that have a positive impact on quality across two dimensions: what kind of activity it is (the four quadrants) and what kind of quality signal it produces (the three rings). The point is not to score or judge the team — it is to see what is possible and decide what to invest in next.

What the quadrants and rings mean

Quadrants — the kind of activity

Scripted — Pre-defined, repeatable checks. Automated tests, API contract tests, CI gates. Strong at catching known regressions, blind to what you didn't think to ask.
Exploratory — Investigative, judgement-driven testing by a human. Session-based exploration, bug hunts, charter-based testing. Strong at finding what scripts miss.
Observable — What the system tells you once it's running. Logging, metrics, traces, real-user monitoring, error tracking. Quality signals from production, not from a test.
Collaborative — Quality that emerges from how the team works. Example mapping, the power of three, pairing, code review, design crits.

Rings — who the signal serves

Internal — The factory floor. Code quality, developer speed, build stability.
External — The product. Functionality, performance and reliability as the user sees it.
Brand — Value and trust. The long-term perception, security and emotional resonance of the product in the market.

How to run it with your team

Gather the right people. Software engineers, testers, product and design people.
List your activities. Everything you do today that contributes to quality. Don't filter yet.
Place each one. Pick the quadrant it fits, then the ring (Internal / External / Brand).
Look at the shape. Empty quadrants and thin rings are conversation starters, not failures. Where do you over-invest? Where are you blind?
Pick one or two moves. Add an activity, drop one, or shift weight — then re-run the radar next quarter.
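The steps above can be sketched in a few lines of code. This is only an illustration of the "list, place, look at the shape" loop, not part of the radar itself. The quadrant and ring names come from this page; the activity names and the helpers radar_shape and blind_spots are hypothetical.

```python
from collections import Counter
from itertools import product

QUADRANTS = ("Scripted", "Exploratory", "Observable", "Collaborative")
RINGS = ("Internal", "External", "Brand")


def radar_shape(placements):
    """Count activities per (quadrant, ring) cell, including empty cells."""
    counts = Counter()
    for activity, quadrant, ring in placements:
        if quadrant not in QUADRANTS or ring not in RINGS:
            raise ValueError(f"Unknown placement for {activity!r}: {quadrant}/{ring}")
        counts[(quadrant, ring)] += 1
    # A Counter returns 0 for missing keys, so empty cells show up explicitly.
    return {cell: counts[cell] for cell in product(QUADRANTS, RINGS)}


def blind_spots(shape):
    """Return the cells with no activities: conversation starters, not failures."""
    return [cell for cell, count in shape.items() if count == 0]


# Hypothetical team input; the activity names are examples only.
placements = [
    ("Unit tests", "Scripted", "Internal"),
    ("Load tests", "Scripted", "External"),
    ("Bug bash", "Exploratory", "External"),
    ("Error tracking", "Observable", "External"),
    ("Code review", "Collaborative", "Internal"),
]

shape = radar_shape(placements)
empty = blind_spots(shape)
```

Re-running the same script next quarter with an updated list gives a simple before-and-after comparison of the shape.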

Scripted

Internal

Defining the conditions used to evaluate whether a feature's behavior meets the intended goal.
Testing interface elements in isolation to observe how they handle different inputs and edge cases.
Studying and applying proven structures to identify the most maintainable path for system growth.
Automated checks to find stylistic inconsistencies and syntax issues that affect codebase maintainability.
Identifying patterns in code that suggest underlying design flaws or potential areas of high technical debt.
Examining source code without execution to identify potential bugs, security flaws, or structural weaknesses.
Testing the communication paths between modules to find data-sharing issues or contract mismatches.
Testing isolated code logic to find immediate functional errors at the lowest level of the application.
Real-time feedback in the editor to identify errors and logic flaws during the coding process.
Checking that services (APIs) continue to adhere to the 'contracts' they have with their consumers.
Injecting small faults into code to measure the 'quality' of your test suite by checking whether the tests fail as expected.
Automated checks to identify and update insecure third-party libraries (e.g., Snyk, Dependabot).
Generating hundreds of random inputs to find edge cases where logic might break.
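As a rough illustration of the last item in this list (generating random inputs to find edge cases), here is a minimal sketch using only the standard library. The function under test, slugify, is hypothetical, and so are the three properties checked. Dedicated libraries such as Hypothesis add input shrinking and smarter generation; this sketch only shows the core idea.

```python
import random


def slugify(title):
    """Hypothetical function under test: lowercase, hyphen-separated words."""
    return "-".join(title.lower().split())


def random_title(rng):
    """Build a random string from letters and assorted whitespace."""
    alphabet = "abcXYZ \t\n"
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 20)))


def check_slugify_properties(runs=500, seed=0):
    """Check properties that should hold for every input, not just chosen examples."""
    rng = random.Random(seed)  # a fixed seed keeps any failure reproducible
    for _ in range(runs):
        title = random_title(rng)
        slug = slugify(title)
        assert slug == slug.lower(), repr(title)
        assert not any(ch.isspace() for ch in slug), repr(title)
        assert slugify(slug) == slug, repr(title)  # applying it twice changes nothing
    return runs


checked = check_slugify_properties()
```

The assertions describe what must always be true rather than one expected output, which is what lets random inputs stand in for hand-picked cases.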

External

Testing the installation and configuration process to identify risks during environmental transitions.
Studying a library of reusable elements to identify the most consistent way to build UI features.
Executing test cases by hand to ensure recent changes haven't broken existing functionality.
Observing system behavior under stress to identify bottlenecks, breaking points, and responsiveness limits.
Testing legacy code to describe its current actual behavior so that future changes can be measured against a known baseline.
Comparing UI snapshots to identify unintended visual changes or layout shifts introduced by new code.
Exercising the full system flow to observe how integrated components behave in real-world scenarios.
Using automated tools to identify known security gaps and configuration risks within the environment.
Testing the software across various hardware and OS configurations to identify environment-related risks.
Observing application behavior across different browser engines to find platform-specific bugs or inconsistencies.
Testing individual units or the whole system to identify gaps between actual behavior and requirements.
Automated checks for character encoding, date formats, and layout shifts across different languages.
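One item in this list, describing the current actual behavior of legacy code as a baseline, can be sketched as follows. The legacy function and its pricing rules are invented for the example, and record_baseline and diff_against_baseline are hypothetical helper names. Amounts are in whole cents to keep the arithmetic exact.

```python
def legacy_shipping_cost(weight_kg):
    """Hypothetical legacy function whose rules nobody fully remembers."""
    if weight_kg <= 0:
        return 0
    cost = 500 + 200 * weight_kg
    if weight_kg > 10:
        cost = cost * 9 // 10  # an undocumented discount for heavy parcels
    return cost


def record_baseline(func, inputs):
    """Capture what the code does today, without judging whether it is right."""
    return {value: func(value) for value in inputs}


def diff_against_baseline(func, baseline):
    """Return inputs whose output no longer matches, as (expected, actual) pairs."""
    return {
        value: (expected, func(value))
        for value, expected in baseline.items()
        if func(value) != expected
    }


def refactored_shipping_cost(weight_kg):
    """Hypothetical rewrite that accidentally drops the heavy-parcel discount."""
    return 500 + 200 * weight_kg if weight_kg > 0 else 0


baseline = record_baseline(legacy_shipping_cost, [-1, 0, 1, 10, 11])
changes = diff_against_baseline(refactored_shipping_cost, baseline)
```

Here the comparison flags only the 11 kg case, which is exactly the behavior the rewrite lost.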

Brand

Probing the system for exploitable paths and security weaknesses by simulating real-world attack patterns.
Investigating how the application behaves for users with disabilities and identifying barriers to WCAG compliance.

Exploratory

Internal

Examining UI/UX plans to identify potential usability flaws or inconsistencies before development begins.
Time-boxed technical investigations aimed at identifying the feasibility or risks of a specific implementation path.
Evaluating system structure to identify long-term scalability risks or technical bottlenecks.
Building low-fidelity versions of features to identify design flaws and learn about technical constraints early.
Investigating individual requirements to identify hidden complexities and potential testing challenges.

External

Time-boxed exploration of specific areas to identify edge cases and unexpected system behaviors.
Collaborative team sessions aimed at uncovering as many diverse bugs as possible in a short period.
A collaborative exercise to identify and prioritize potential quality risks across the system.
Measuring current system response times to identify baselines and potential degradation over time.
Rolling out changes to a small subset of users to identify real-world stability risks before a full launch.
Proactively injecting failures (e.g., latency, server shutdowns) to observe and improve system resilience.
Time-boxed sessions specifically focused on finding corner cases and boundary value failures in new features.
Exploring system limits to see at what point defined reliability targets (Service Level Objectives) are breached.

Brand

Gathering data on the current state of inclusivity to identify where the system fails to meet diverse user needs.
Investigating user behaviors and motivations to identify the most valuable problems to solve.
Observing how different versions of a feature affect user behavior to identify which approach better meets goals.
Observing real users interacting with the system to identify friction points and areas of confusion.
Evaluating the system with target users to identify gaps between the built product and user expectations.
Investigating the architecture to identify potential attack vectors and systemic security risks.
Creative, offensive security exercises to find human and systemic vulnerabilities beyond standard pen testing.

Observable

Internal

Mapping the path of requests through services to identify performance bottlenecks and hidden dependencies.
Assessing the clarity of the codebase to identify risks related to maintainability and future bugs.
Capturing descriptive system events to identify the root cause of failures and unexpected logic paths.
Observing the health of the automation pipeline to identify instability in the build and deployment process.
Analyzing pass/fail rates and coverage to identify the effectiveness and reliability of the testing suite.
Tracking metrics like Time to First PR to identify quality friction within the engineering team.
Monitoring the lifecycle of feature toggles to ensure 'stale' flags do not become technical debt.
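The last item, watching for stale feature toggles, lends itself to a small automated check. The sketch below assumes a flag is "stale" once it is older than 90 days and sits at 0% or 100% rollout. Both the threshold and the data shape are assumptions for illustration, not the output format of any particular feature-flag tool.

```python
from datetime import date, timedelta


def stale_flags(flags, today, max_age_days=90):
    """Return names of flags that are fully on or fully off and older than the limit."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        name
        for name, (created, rollout_percent) in flags.items()
        if created < cutoff and rollout_percent in (0, 100)
    )


# Hypothetical flag inventory: name -> (creation date, rollout percentage).
flags = {
    "new-checkout": (date(2024, 1, 10), 100),
    "dark-mode": (date(2024, 5, 1), 100),
    "beta-search": (date(2023, 11, 1), 50),
    "old-banner": (date(2023, 6, 1), 0),
}

stale = stale_flags(flags, today=date(2024, 6, 1))
```

A flag that is still partially rolled out is left alone here, on the assumption that it is still part of an active experiment.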

External

Testing the observability stack itself to identify "blind spots" where the system might fail silently.
Analyzing aggregated data to identify long-term trends in product performance and user behavior.
Measuring delivery performance (like Lead Time and MTTR) to identify bottlenecks in the software lifecycle.
Consolidating key indicators to identify the overall "vital signs" of the system at a glance.
Providing transparent communication on system state to learn how downtime affects user trust and support load.
Capturing actual user experience data (like load times) to identify performance risks in the wild.
Testing live endpoints to identify latency, breaking changes, or uptime risks for integrated services.
Configuring signals for abnormal behavior to identify emerging issues before they become full incidents.
Running simulated user transactions against production at regular intervals to detect downtime before users do.

Brand

Tracking the frequency of system failures to identify stability trends and high-risk areas.
Investigating bugs found in production to identify gaps in the earlier testing phases.
Monitoring resource consumption to identify inefficiencies in system architecture or cloud infrastructure.
Gathering direct user sentiment to identify the perceived quality and value of the product.
Studying documented breaches or near-misses to identify patterns in security weaknesses.
Engaging in direct dialogue to identify how real-world usage differs from intended design.
Learning from internal stakeholders to identify systemic friction in the delivery or support process.
Observing how users navigate the system to identify drop-off points or unexpected usage patterns.
Studying public feedback to identify common pain points and market perceptions of quality.
Measuring the transition of users through key milestones to identify where the product fails to drive engagement.
Investigating the effectiveness of the top-of-funnel experience to identify barriers for new users.
Observing user exit patterns to identify the specific features or behaviors that cause users to leave.
Monitoring how much allowed 'unreliability' remains, and triggering a release freeze when it runs out, to protect brand trust.
Watching real users in their natural environment to identify 'workarounds' that signal quality gaps.
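The item in this list about monitoring allowed 'unreliability' before a release freeze comes down to simple arithmetic. The sketch below assumes an availability target measured over a rolling window; a 99.9% target over 30 days allows roughly 43 minutes of downtime. The function names and the freeze policy (freeze once the allowance is fully spent) are hypothetical.

```python
def error_budget_minutes(slo_target, window_days):
    """Total downtime the target permits over the window, in minutes."""
    return (1 - slo_target) * window_days * 24 * 60


def budget_remaining(slo_target, window_days, downtime_minutes):
    """Fraction of the budget still unspent; negative means it is overspent."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


def should_freeze_releases(slo_target, window_days, downtime_minutes):
    """Hypothetical policy: freeze once the budget is fully spent."""
    return budget_remaining(slo_target, window_days, downtime_minutes) <= 0


budget = error_budget_minutes(0.999, 30)
freeze_after_10_min = should_freeze_releases(0.999, 30, downtime_minutes=10)
freeze_after_50_min = should_freeze_releases(0.999, 30, downtime_minutes=50)
```

Teams often set a softer threshold as well, for example slowing releases once most of the allowance is gone, but that choice is a policy decision rather than a calculation.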

Collaborative

Internal

Two people working at one station to identify logic flaws and share context in real-time.
The whole team working on the same task simultaneously to identify systemic bottlenecks and align on implementation patterns.
Investigating proposed code changes to identify bugs, architectural misalignments, and opportunities for shared learning.
Specific exploration of interface changes to identify visual regressions and deviations from design standards.
Test-Driven Development. Using tests to specify behavior before coding to identify logic gaps and drive out a simpler design.
Studying shared standards to identify the most consistent and maintainable way for the team to write software.
Request for Comments and Architecture Decision Records. Documenting the "why" behind technical choices to identify the constraints and trade-offs known at the time.
Allowing other teams to contribute code with strict quality gates and collaborative reviews.

External

Assigning clear accountability for a feature to identify who holds the holistic context of its quality and goals.
Maintaining a shared knowledge base to identify how the system is intended to work and where knowledge gaps exist.
Collaboratively investigating upcoming work to identify dependencies, effort, and potential implementation risks.
Looking back at the team's process to identify friction points and learn how to improve collaborative quality.
Establishing shared criteria to identify when work is sufficiently understood to start and when it is high enough quality to ship.
Bringing product, dev, and test together to identify diverse perspectives on a feature's requirements and risks.
Using concrete scenarios to identify the boundaries of a story and uncover hidden business rules.
Testing the path to production to identify risks in how the software is packaged and delivered to users.
Investigating the root causes of incidents to identify systemic failures and learn how to prevent future occurrences.
Visually organizing features to identify the user journey and ensure the most critical quality areas are prioritized.
Collaborative sessions to identify whether proposed interfaces meet user needs and technical feasibility.
Showcasing work-in-progress to stakeholders to identify gaps between the built product and the intended value.
Gathering the whole team (devs, testers, designers) to test a feature together and share quality perspectives.
Sessions to rank features by probability and impact of failure to prioritize testing effort.
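The last item, ranking features by probability and impact of failure, can be run as a simple scoring exercise. This sketch assumes the team scores both on a 1 to 5 scale and multiplies them. The scale, the feature names, and the scores are made up for illustration.

```python
def rank_by_risk(features):
    """Sort features by probability x impact, highest risk first."""
    scored = [
        (name, probability * impact)
        for name, probability, impact in features
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical session output: (feature, probability of failure, impact of failure).
features = [
    ("Checkout", 3, 5),
    ("Profile page", 2, 2),
    ("Search", 4, 3),
]

ranking = rank_by_risk(features)
```

The numbers matter less than the conversation that produces them. The ranking is a starting point for deciding where testing effort goes first.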

Brand

Testing the feedback loop between users and the team to identify how quickly production issues are identified and resolved.
Aligning on which "ilities" (e.g., security, speed) matter most to identify where testing efforts should be focused.
Evaluating the team's response to failure to identify ways to reduce time-to-recovery and improve communication.
Investigating recurring incidents to identify and eliminate the underlying systemic causes.
Sharing data about testing activities to identify the current level of confidence and remaining quality risks.
Aligning on high-level goals and results to identify whether the team’s efforts are moving the needle on quality and value.
Communicating changes to users so they can see which new features or bug fixes have been delivered.
Notifying stakeholders of launches to identify potential impacts on the broader business ecosystem.
Investigating market readiness to identify the risks and opportunities of launching at a specific time.
Creating instructional content to identify where the UI might be too complex and requires additional user support.
Visualizing the long-term plan to identify future quality challenges and resource needs.
Strategizing how information is shared during changes to identify and mitigate risks to transparency and trust.
Practicing full system restoration to ensure backup and recovery processes actually work.
Collaboratively mapping touchpoints to identify where quality drops off outside the software itself.

Original radar concept by Nicola Sedgwick, refined by Callum Akehurst-Ryan.