• How do you take an immature experimentation organization – the kind that runs one or two A/B tests a month – and turn it into a Booking.com?

    This is a question that we – and many of our clients – have been trying to answer for years. 

    We’ve approached this problem from many different angles. We’ve developed a number of models and frameworks to support us in this work, and we’ve tailored – and implemented – maturation plans for a huge range of clients: from some of the most mature experimentation organizations on the planet to complete newbies. 

    Throughout all of this work – and through our broader work in experimentation – we’ve gradually been able to put together a map of the experimentation maturity landscape. 

    This map, which has become our flagship maturity model, is proving to be an invaluable resource, allowing us to tell our clients:

    1. How mature their experimentation function is relative to the best in the business
    2. Where, specifically, they’re doing well – and where they’re falling down
    3. How they can remedy shortcomings and take meaningful strides towards maturity

    Throughout the remainder of this blog post, we’re going to share this maturity model with you. 

    …but first:

    There are already tons of existing maturity models kicking about in our industry. Why did we feel the need to develop another one?


  • 1. Not another maturity model: why we felt the need to develop a new one

    A couple of months ago, I was working with an ambitious client to try and mature their experimentation program. 

    According to our 3 V model of experimentation success, this client was doing everything right:

    • Velocity – the speed from ideation to launch was extremely fast
    • Volume – they were launching lots of experiments each month
    • Value – the experiments they were launching were driving real, demonstrable business growth

    Unfortunately, this 3 V analysis was missing something. 

    While this team was doing lots of things right – they’d matured immensely since we’d started working together – I knew there was still tons of room for improvement. 

    To give two examples:

    1. The client was using experimentation to drive real business value, but this value was siloed to a couple of teams and had produced zero impact in other areas of the business. The most mature experimentation teams we work with use experimentation to make better decisions across every area of their business. 
    2. The client was using research to inform experiments, but their research was extremely infrequent. The most mature experimentation teams we work with tend to have an ‘always-on’ research mentality, which allows a true Mixed Methods approach to develop. 

    My first port of call in trying to solve this problem was to look to the PACET model that we’d developed many years ago. 

    PACET essentially breaks an experimentation function down into 5 factors – Process, Accountability, Culture, Expertise, and Technology – and attempts to identify and remedy any bottlenecks that are harming program performance. 


    Unfortunately, for all its many strengths, the trouble with PACET is that it doesn’t provide a clear series of stepping stones that an experimentation program can use to benchmark and mature its approach. Put another way, I needed a map – with clear milestones – that I could use to benchmark my client’s maturity and help them level up. 

    The good news is that I work for the world’s leading experimentation agency (!), so I decided to tap into the collective experience of our 40+ strong consulting team to begin building a comprehensive map of the experimentation maturity landscape.  

    After much trial, error, discussion, iteration, and refinement, we’ve now arrived at a model that is delivering real value for clients, providing them with specific goals and actions that they are using to mature their programs at breakneck speed. 

    We’re hopeful that this model can do the same for your program too, so here it is.

  • 2. The five stages of program maturity

    At the highest level, our maturity model breaks down the spectrum of experimentation maturity into five discrete stages. 

    These stages range from teams that are running the odd ad-hoc test to companies like Duolingo and Microsoft that run thousands of tests each year and use experimentation to inform decisions across every area of the business.

    Here are the five stages:

    1. Reactive

    Teams at the reactive stage are characterized by ad hoc, sporadic testing initiated by individuals without strategic direction or leadership support. These organizations typically run occasional experiments focused on low-hanging fruit, with basic A/B tests that lack proper documentation or knowledge sharing.

    There’s no clear program goal or defined KPIs, and ROI tracking hasn’t even crossed their minds. The testing tool stands alone without integration to other data platforms, and research is extremely limited. These teams are essentially testing wherever they can, whenever they can, without any formal framework or stopping protocol.

    The key challenge at this stage is the complete absence of structure – no hypothesis framework, no pipeline of experiments, and crucially, no buy-in from leadership. Results are barely documented or shared, leading to a cycle of “spaghetti testing” where learnings are lost and mistakes are repeated.

    2. Emerging

    At the emerging stage, experimentation begins to gain traction within specific teams or projects. While there’s still no formal framework or strategy, experiments become more regular and organized. Teams start building awareness of testing’s value and demonstrating wins to gain broader support.

    These programs typically have some individuals championing experimentation, but lack official buy-in or formalized processes. Volume and velocity are slower than optimal, but teams are beginning to identify blockers. A backlog starts forming, though without clear prioritization methods.

    The key development here is that KPIs are being assessed and questioned, with ROI showing for some experiments. Research remains sporadic – conducted when time allows or specific questions arise. While results are shared, cross-functional learning remains limited due to the lack of integration between testing tools and data platforms.

    3. Strategic

    Strategic programs represent a significant maturity leap. Experimentation is now recognized as a strategic activity with clear buy-in from senior leadership. A dedicated team leads or governs experimentation efforts, with established frameworks for hypotheses, experiment plans, and summaries.

    These organizations have defined success metrics with a primary KPI closest to the business goal. Experiments align with business objectives and ladder up to an overarching goal. Research is conducted regularly with a cohesive plan, and a culture of experimentation is taking shape.

    The testing tool is integrated with supporting analytics platforms, enabling behavioral analysis and advanced experiments like multi-armed bandits, multivariate tests, and personalization. Teams focus on aligning strategically and establishing standardized processes, with consistent approaches to prioritization and clear stopping protocols.

    4. Integrated

    Integrated organizations have experimentation embedded across the entire company. There’s a company-wide vision with a shared roadmap, and most of the business is empowered to experiment. Strategic goals align with business objectives while pushing the boundaries of established norms.

    Cross-functional collaboration is frequent, with learnings applied across teams. KPIs are clearly defined, tracked regularly, and insights are shared business-wide and acted upon. These companies actively scale their experimentation efforts, understanding that volume and velocity may dip temporarily in favor of more complex tests.

    A constant research loop with clear questions feeds the experimentation pipeline. The rigorous processes ensure excellent documentation of insights and outcomes. The backlog is consistently fed with high-quality experiments, and prioritization is automated and easily utilized. These teams focus on scaling experimentation across the organization.

    5. Optimized

    Optimized organizations represent the pinnacle of experimentation maturity – think Amazon, Netflix, or Booking.com. Experimentation is fundamental to their business model, integrated into everything they do across every channel. There’s a test-and-learn culture at all levels, with every employee empowered to run experiments.

    These companies use sophisticated frameworks, tools, and processes, leveraging AI and automation to speed up and scale. They run experiments in every aspect of business – online and offline – with clear measures balancing learning and earning goals.

    The process is continuously refined, with everyone following and optimizing the delivery process. Insights are well-documented, shared continuously, and feed back into strategy. Research is consistent and constant, bringing in new methodologies. These organizations often outgrow commercial testing tools and build their own, pioneering complex measurement and experimentation approaches. They focus on innovation and competitive advantage through experimentation.

  • 3. The areas dimension of maturity

    Now, some of you might have read the preceding section and thought:

    ‘Hollldd up – I feel like we tick some of the criteria for this stage but not others.’

    If this is you, you’re not alone: we found the same for almost all of the clients we used this model with. 

    This is where the maturity areas come in: in essence, we’ve taken the various criteria that define each stage of the model, and we’ve clustered these criteria into four primary areas, which are:

    1. Experiment goals
    2. Delivery and process
    3. Strategy and culture
    4. Data & tools

    Organizations rarely mature evenly – you might be Strategic in experiment goals but only Emerging in data and tools. By introducing the areas dimension into our model, we’re able to identify the specific places where each program is falling short. Once we’ve diagnosed weaknesses, we’re then in a much stronger position to begin fixing them.  
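
    To make this concrete, here’s a minimal sketch in Python (the stage and area names are taken straight from the model above; the scores themselves are invented for illustration) of what an uneven maturity profile might look like:

```python
from enum import IntEnum

# The five stages from the model, ordered from least to most mature.
class Stage(IntEnum):
    REACTIVE = 1
    EMERGING = 2
    STRATEGIC = 3
    INTEGRATED = 4
    OPTIMIZED = 5

# A hypothetical program, assessed separately on each of the four areas.
profile = {
    "Experiment goals": Stage.STRATEGIC,
    "Delivery and process": Stage.STRATEGIC,
    "Strategy and culture": Stage.EMERGING,
    "Data and tools": Stage.EMERGING,
}

for area, stage in profile.items():
    print(f"{area}: {stage.name.capitalize()}")
```

    Scoring each area separately is the whole point: the same program can legitimately sit at two different stages at once, and a single overall label would hide that.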

    1. Experiment Goals

    This dimension examines what you’re trying to achieve with experimentation. Are experiments random and goalless (Reactive), or do they ladder up to strategic business objectives (Strategic)? The most mature programs have company-wide goals that push boundaries and treat both learning and earning as valuable outcomes.

    2. Delivery and Process

    How efficiently and effectively do you run experiments? This covers everything from tracking velocity through each stage of the experimentation process to having clear frameworks and stopping protocols. Mature programs have rigorous, well-documented processes that everyone follows, with continuous optimization of the process itself.

    For an example of optimizing the optimization process itself, check out this blog post. 

    3. Strategy and Culture

    The cultural dimension is often the hardest to change but also the most impactful. It encompasses leadership buy-in, how widely experimentation is adopted, and whether there’s a true test-and-learn mindset across the business. Advanced programs have experimentation embedded in their DNA, with everyone from C-suite to individual contributors running tests.

    4. Data and Tools

    This covers both the research feeding your experiments and the technical infrastructure supporting them. Mature programs have constant research loops, integrated tech stacks, and advanced testing capabilities. They’ve often moved beyond commercial tools to custom solutions that support their scale and complexity.

    By assessing where you stand on each dimension, you can create targeted improvement plans. For instance, if you’re Strategic in goals but Emerging in tools, you know to focus on tech stack integration and research capabilities.

  • 4. Bringing it all together: how to actually apply this stuff

    Now that you understand the maturity stages and areas, we’re going to finish up this article by sharing the step-by-step process that we’ve been using to help our clients level up their maturity. 

    Here it is:

    Step #1: Honest Assessment

    Gather your experimentation team and stakeholders to evaluate where you currently stand on each area. Use the detailed criteria above to score yourselves objectively. Don’t aim for perfection – even being aware of your gaps is valuable progress.

    Step #2: Identify Your Constraints

    Look for the areas where your program is least mature. These are your primary constraints holding back overall program maturity. You can’t jump from Reactive to Optimized overnight, but you can identify the specific blockers preventing you from reaching the next stage.
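
    If it helps, here’s the illustrative sketch from earlier taken one step further (the scores are still hypothetical): your constraint is simply whichever area – or areas – sits at the lowest stage.

```python
from enum import IntEnum

# The same five stages, ordered so the assessments can be compared.
class Stage(IntEnum):
    REACTIVE = 1
    EMERGING = 2
    STRATEGIC = 3
    INTEGRATED = 4
    OPTIMIZED = 5

# Step #1 output: your honest per-area assessment (hypothetical scores).
profile = {
    "Experiment goals": Stage.STRATEGIC,
    "Delivery and process": Stage.STRATEGIC,
    "Strategy and culture": Stage.EMERGING,
    "Data and tools": Stage.EMERGING,
}

# Step #2: the least mature area(s) are the constraints on the whole program.
weakest = min(profile.values())
constraints = [area for area, stage in profile.items() if stage == weakest]

print(f"Primary constraints ({weakest.name.capitalize()}): {', '.join(constraints)}")
# Primary constraints (Emerging): Strategy and culture, Data and tools
```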

    Step #3: Create Your Roadmap

    Based on your assessment, identify 2-3 concrete actions that will move you forward in the next 6 months. For example, if you’re Emerging in Data & Tools, you might focus on:

    • Establishing defined KPIs with proper tracking
    • Implementing continuous research practices
    • Integrating your testing tool with analytics platforms
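
    One way to picture this step (a rough sketch only – the Data & Tools actions are the three listed above, and everything else is a placeholder you’d replace with items from your own assessment) is as a simple lookup from a constrained area and its current stage to a shortlist of candidate actions:

```python
# Candidate actions keyed by (area, current stage). The "Data and tools" entry
# reuses the three example actions above; the other entry is a placeholder.
candidate_actions = {
    ("Data and tools", "Emerging"): [
        "Establish defined KPIs with proper tracking",
        "Implement continuous research practices",
        "Integrate the testing tool with analytics platforms",
    ],
    ("Strategy and culture", "Emerging"): [
        "Secure formal buy-in from senior leadership",
        # ...filled in from your own assessment
    ],
}

def roadmap(area: str, current_stage: str, max_actions: int = 3) -> list[str]:
    """Pick the 2-3 concrete actions to focus on over the next 6 months."""
    return candidate_actions.get((area, current_stage), [])[:max_actions]

print(roadmap("Data and tools", "Emerging"))
```

    However you record it, the useful part is that each action is tied to the specific area and stage you diagnosed, rather than being generic best practice.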

    Here’s an example of 3 actions that we chose to focus on with one of our clients recently:

    Step #4: Set Realistic Timelines

    Most organizations take 3-5 years to move from Reactive to Optimized. Plan for steady progress:

    1. Year 1: Move from Reactive to Strategic, focusing on the fundamentals
    2. Years 2-3: Progress to Integrated, scaling successful practices
    3. Years 3-5: Push toward Optimized, innovating and leading your industry

    Step #5: Regular Reviews

    Reassess your maturity every 6 months. Celebrate progress in specific areas while identifying new constraints. Remember, maturity isn’t just about running more tests – it’s about building a sustainable system that drives continuous improvement and innovation.

    The key is starting where you are and taking consistent steps forward. Even small improvements in process, culture, or tools can unlock significant value when compounded over time.

  • Thanks for reading! If you’d like to chat about how we can help you improve your program’s maturity – or if you’d just like to chat about experimentation in general – please do get in touch. We’re passionate about experimentation and always looking to share notes. 

    Feel free to reach out to us by filling out our contact us form!